# Partitioned Parallel Radix Sort

@article{Lee2000PartitionedPR, title={Partitioned Parallel Radix Sort}, author={Shindong Lee and Minsoo Jeon and Dongseung Kim and Andrew Sohn}, journal={J. Parallel Distributed Comput.}, year={2000}, volume={62}, pages={656-668} }

Load balanced parallel radix sort solved the load imbalance problem present in parallel radix sort. Redistributing the keys in each round of radix, each processor has exactly the same number of keys, thereby reducing the overall sorting time. Load balanced radix sort is currently known the fastest internal sorting method for distributed-memory multiprocessors. However, as the computation time is balanced, the communication time emerges as the bottleneck of the overall sorting performance due to…

## Figures and Tables from this paper

## 53 Citations

### PARADIS: An Efficient Parallel Algorithm for In-place Radix Sort

- Computer ScienceProc. VLDB Endow.
- 2015

A novel parallel in-place radix sort algorithm, PARADIS, which addresses both problems: a "speculative permutation" solves the first problem by assigning multiple non-continuous array stripes to each processor, and the resulting shared-nothing scheme achieves full parallelization.

### Load-balanced parallel merge sort on distributed memory parallel computers

- Computer ScienceProceedings 16th International Parallel and Distributed Processing Symposium
- 2002

Sort can be speeded up on parallel computers by dividing and computing data individually in parallel. Merge sort can be parallelized, however, the conventional algorithm implemented on distributed…

### Parallel Merge Sort with Load Balancing

- Computer ScienceInternational Journal of Parallel Programming
- 2004

The proposed load-balanced merge sort utilizes all processors throughout the computation and evenly distributes data to all processors in each stage, thereby every processor is forced to work in all phases.

### A full parallel radix sorting algorithm for multicore processors

- Computer Science
- 2011

This paper introduces PARL, a parallel left radix sorting algorithm for use on ordinary shared memory multi core machines, that has just one simple statement in its sequential part and is user friendly in that the user calling this PARL algorithm does not have to deal with creating threads themselves; to sort their data, they just create a sorting object and make a call to a thread safe method in that object.

### Parallelizing Merge Sort onto Distributed Memory Parallel Computers

- Computer ScienceISHPC
- 2002

Merge sort is useful in sorting a great number of data progressively, especially when they can be partitioned and easily collected to a few processors. Merge sort can be parallelized, however,…

### Theoretically-Efficient and Practical Parallel In-Place Radix Sorting

- Computer ScienceSPAA
- 2019

The performance of Regions Sort is compared to existing parallel in-place and out-of-place sorting algorithms on a variety of input distributions and shown to be faster than optimized out- of-place radix sorting and comparison sorting algorithms.

### High-speed parallel external sorting of data with arbitrary distribution

- Computer ScienceInt. J. High Perform. Comput. Netw.
- 2004

Two distribution-insensitive scalable parallel external sorting algorithms that use sampling technique and histogram counts to achieve even distribution of keys, which eventually contribute to achieve good performance are developed.

### Modified Pure Radix Sort for Large Heterogeneous Data Set

- Computer Science
- 2012

This research paper presents a new modified pure radix sort algorithm for large heterogeneous data set, which tries to optimize all related problems of radixsort through this algorithm.

### Communication-efficient bitonic sort on a distributed memory parallel computer

- Computer ScienceProceedings. Eighth International Conference on Parallel and Distributed Systems. ICPADS 2001
- 2001

An efficient way of data communication in bitonic sort to minimize the interprocessor communication and comparison time is presented and it is believed the scheme is a good way to shorten the communication time in similar applications.

### A Load-Balanced Parallel and Distributed Sorting Algorithm Implemented with PGX.D

- Computer Science2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
- 2017

Sorting has been one of the most challenging studied problems in different scientific researches. Although many techniques and algorithms have been proposed on the theory of having efficient parallel…

## References

SHOWING 1-10 OF 21 REFERENCES

### Load balanced parallel radix sort

- Computer ScienceICS '98
- 1998

Experimental results indicate that balanced radix sort can sort OSG integers in 20 seconds and 128M doubles in 15 seconds on a 64-processor SPZWN while yielding over 40-fold speedup.

### Communication-efficient bitonic sort on a distributed memory parallel computer

- Computer ScienceProceedings. Eighth International Conference on Parallel and Distributed Systems. ICPADS 2001
- 2001

An efficient way of data communication in bitonic sort to minimize the interprocessor communication and comparison time is presented and it is believed the scheme is a good way to shorten the communication time in similar applications.

### Parallel algorithms for personalized communication and sorting with an experimental study (extended abstract)

- Computer ScienceSPAA '96
- 1996

A novel variation on sample sort which uses only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead and performance is invariant over the set of input distributions unlike previous efficient algorithms.

### Fast Parallel Sorting Under LogP: Experience with the CM-5

- Computer ScienceIEEE Trans. Parallel Distributed Syst.
- 1996

The LogP model is shown to be a valuable guide in the development of parallel algorithms and a good predictor of implementation performance; the model encourages the use of data layouts which minimize communication and balanced communication schedules which avoid contention.

### Minimizing Communication in the Bitonic Sort

- Computer ScienceIEEE Trans. Parallel Distributed Syst.
- 2000

A recirculating bitonic sorting network is presented, which is composed of one level of N/2 comparators plus an /spl Omega/-network of (log N-1) switch levels, which reduces the cost complexity to O(N log N) compared with the O (N log/sup 2/ N) of the original bitonic sorted network, while preserving the same time complexity.

### Parallel sorting and data partitioning by sampling

- Computer Science
- 1983

The analysis is developed for parallel sorting in a local network environment, with distributed data sets in secondary storage devices, and a data partitioning method by sampling is proposed.

### Identifying the capability of overlapping computation with communication

- Computer ScienceProceedings of the 1996 Conference on Parallel Architectures and Compilation Technique
- 1996

Experimental results indicate that both multiprocessors would yield up to 30% to 40% overlap of communication time when the message size is approximately 1K integers.

### Tight Bounds on the Complexity of Parallel Sorting

- Computer Science, MathematicsIEEE Transactions on Computers
- 1985

Tight upper and lower bounds are proved on the number of processors, information transfer, wire area, and time needed to sort N numbers in a bounded-degree fixed-connection network.

### Sorting networks and their applications

- Computer ScienceAFIPS '68 (Spring)
- 1968

To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several…