Parallel merge sort

  • Richard J. Cole
  • Published 1 August 1988
  • Computer Science
  • 27th Annual Symposium on Foundations of Computer Science (sfcs 1986)
We give a parallel implementation of merge sort on a CREW PRAM that uses n processors and O(log n) time; the constant in the running time is small. We also give a more complex version of the algorithm for the EREW PRAM; it also uses n processors and O(log n) time. The constant in the running time is still moderate, though not as small.
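To make the setting concrete, the following is a minimal sequential sketch of merging by ranking, a standard building block for merge sort on a CREW PRAM. It is not the pipelined algorithm of the paper: each element's output position is found by an independent binary search, so with one processor per element a single merge would take O(log n) time and a plain merge sort built from it O(log² n) time. The function names `merge_by_ranking` and `merge_sort` are illustrative, not taken from the paper.

```python
from bisect import bisect_left, bisect_right


def merge_by_ranking(a, b):
    """Merge two sorted lists by computing each element's final index directly."""
    out = [None] * (len(a) + len(b))
    # Every iteration below is independent of the others; on a PRAM each
    # element would be handled by its own processor (assumption of this sketch).
    for i, x in enumerate(a):
        # Final index = own rank in a + number of elements of b strictly smaller.
        out[i + bisect_left(b, x)] = x
    for j, y in enumerate(b):
        # Ties are broken in favour of a, so count elements of a that are <= y.
        out[j + bisect_right(a, y)] = y
    return out


def merge_sort(xs):
    """Ordinary top-down merge sort using the rank-based merge."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    return merge_by_ranking(merge_sort(xs[:mid]), merge_sort(xs[mid:]))
```

Because every output index is computed without looking at the other elements' results, the two loops need concurrent reads but no concurrent writes, which is why the CREW model is the natural fit for this kind of merge.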
Towards Optimal Parallel Bucket Sorting
A Parallel Bucket Sort
Time-Space Optimal Parallel Merging and Sorting
The authors present a parallel merging algorithm that, on an exclusive-read exclusive-write (EREW) parallel random-access machine (PRAM) with k processors merges two sorted lists of total length n in
Parallel Iterated Bucket Sort
Sorting Roughly Sorted Sequences in Parallel
Optimal Merging and Sorting on the EREW PRAM
Hybridsort Revisited and Parallelized
A parallel priority data structure with applications
A parallel priority data structure that improves the running time of certain algorithms for problems that lack a fast and work-efficient parallel solution; its operations can be supported in O(1) time.
A Note on Adaptive Parallel Sorting
Parallel heap: improved and simplified
  • S. Prasad, N. Deo
  • Computer Science
    Proceedings Sixth International Parallel Processing Symposium
  • 1992
This version of the parallel heap data structure does not require dedicated maintenance processors, performs insertion and deletion in place, and can efficiently utilize any number of processors in the range 1 through n.


Sorting in c log n parallel steps
A sorting network with cn log n comparisons, where in the i-th step of the algorithm the contents of registers Rj and Rk (where j, k are absolute constants) are compared and then exchanged or not according to the result of the comparison.
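The compare-exchange step described above can be illustrated with a small sketch. The five-comparator network for four inputs used here is a standard textbook example, not the cn log n network of the paper, and the names `apply_network` and `FOUR_INPUT_NETWORK` are illustrative.

```python
def apply_network(registers, comparators):
    """Run a fixed sequence of compare-exchange steps on a copy of the registers."""
    r = list(registers)
    for j, k in comparators:
        # Step i always looks at the same pair (j, k), whatever the data;
        # the contents are swapped only if they are out of order.
        if r[j] > r[k]:
            r[j], r[k] = r[k], r[j]
    return r


# A small fixed network for four registers (illustrative example only).
FOUR_INPUT_NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
```

For example, `apply_network([3, 1, 4, 2], FOUR_INPUT_NETWORK)` returns the registers in ascending order. The pairs are chosen in advance and never depend on the input, which is what distinguishes a sorting network from a general comparison sort.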
Parallelism in Comparison Problems
The worst-case time complexity of algorithms for multiprocessor computers with binary comparisons as the basic operations is investigated and the algorithm for finding the maximum is shown to be optimal for all values of k and n.
Searching, Merging, and Sorting in Parallel Computation
  • C. Kruskal
  • Computer Science
    IEEE Transactions on Computers
  • 1983
A merging algorithm is presented that is optimal up to a constant factor when merging two lists of equal size (independent of the number of processors); as a special case, with N processors it merges two lists, each of size N, in 1.893 lg lg N + 4 comparison steps.
New Parallel-Sorting Schemes
  • F. Preparata
  • Computer Science
    IEEE Transactions on Computers
  • 1978
A family of parallel-sorting algorithms for a multiprocessor system; the algorithms are enumeration sorts and use parallel merging to implement count acquisition, matching the performance of Hirschberg's algorithm, which, however, is not free of fetch conflicts.
Applying parallel computation algorithms in the design of serial algorithms
  • N. Megiddo
  • Computer Science
    22nd Annual Symposium on Foundations of Computer Science (sfcs 1981)
  • 1981
It is pointed out that analyses of parallelism in computational problems have practical implications even when multi-processor machines are not available, and a unified framework for cases like this is presented.
Tight Bounds on the Complexity of Parallel Sorting
  • F. Leighton
  • Computer Science, Mathematics
    IEEE Transactions on Computers
  • 1985
Tight upper and lower bounds are proved on the number of processors, information transfer, wire area, and time needed to sort N numbers in a bounded-degree fixed-connection network.
Routing, merging and sorting on parallel models of computation
It is shown that log log n - log log r is asymptotically optimal for rn processors to merge two sorted lists of n elements and is able to achieve such an efficient sort via Valiant's parallel merging algorithm.
Improved upper bounds on shellsort
  • J. Incerpi, R. Sedgewick
  • Computer Science
    24th Annual Symposium on Foundations of Computer Science (sfcs 1983)
  • 1983
The running time of Shellsort, with the number of passes restricted to O(log N), was thought for some time to be Θ(N^(3/2)), but a different approach is used to achieve O(N^(1+4/√(2 lg N))).
Sorting networks and their applications
To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several