# Robust Massively Parallel Sorting

@inproceedings{Axtmann2017RobustMP, title={Robust Massively Parallel Sorting}, author={Michael Axtmann and Peter Sanders}, booktitle={ALENEX}, year={2017} }

We investigate distributed memory parallel sorting algorithms that scale to the largest available machines and are robust with respect to input size and distribution of the input elements. The main outcome is that four sorting algorithms cover the entire range of possible input sizes. For three algorithms we devise new low overhead mechanisms to make them robust with respect to duplicate keys and skewed input distributions. One of these, designed for medium sized inputs, is a new variant of…

## 15 Citations

### Parallel Quicksort without Pairwise Element Exchange

- Computer Science
- 2018

It is shown that with good pivot selection, Quicksort without pairwise element exchange can be significantly faster than standard implementations on moderately large problems, and that for smaller input sizes, standard and exchange-free variants can be combined to exploit the exchange-free variant as subproblems become large enough relative to the number of processors.

### Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools

- Computer Science
- ArXiv
- 2018

This dissertation focuses on two fundamental sorting problems, string sorting and suffix sorting. It proposes both multiway distribution-based string sorting with string sample sort and multiway merge-based string sorting with LCP-aware merge and mergesort, and engineers and parallelizes both approaches.

### Parallel Quicksort without Pairwise Element Exchange

- Computer Science
- ArXiv
- 2018

A template implementation is given that reduces the total volume of data exchanged from $O(n \log p)$ to $O(n)$, $n$ being the total number of elements to be sorted and $p$ a power-of-two number of processors, while preserving the flavor, characteristics and properties of a Quicksort implementation.

### Communication-Efficient String Sorting

- Computer Science
- 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2020

These algorithms inspect only characters that are needed to determine the sorting order and communication volume is reduced by also communicating only those characters and by communicating repetitions of the same prefixes only once.

### OLAPS: Online load-balancing in range-partitioned main memory database with approximate partition statistics

- Computer Science
- Comput. Sci. Inf. Syst.
- 2018

This paper proposes an approach for maintaining balanced loads over a set of nodes as in a system of communicating vessels, by migrating tuples between neighboring nodes, based on an approximate Partition Statistics Table.

### Engineering faster sorters for small sets of items

- Computer Science
- Softw. Pract. Exp.
- 2021

The results clearly show the potential of using conditional moves in the field of sorting algorithms, as when sorting only small sets of integers, the sorting networks outperform insertion sort.
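
As a rough illustration of the sorting-network idea mentioned above, the sketch below sorts four items with a fixed, data-independent sequence of compare-exchange steps. The names `NETWORK_4` and `sort4` are hypothetical, and the comparator sequence is a standard 5-comparator network for four inputs, not code taken from the cited paper. In Python, `min` and `max` only stand in for the conditional-move instructions a compiled implementation would use, so this shows the structure of the technique rather than its performance.

```python
from itertools import permutations

# A standard 5-comparator sorting network for 4 inputs (illustrative choice).
# Each pair (i, j) with i < j means: place the smaller item at i, the larger at j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sort4(items):
    """Sort exactly four items with a fixed sequence of compare-exchange steps."""
    a = list(items)
    if len(a) != 4:
        raise ValueError("sort4 expects exactly four items")
    for i, j in NETWORK_4:
        # The same positions are compared regardless of the data, so no
        # data-dependent control flow is needed to decide what to compare.
        lo, hi = min(a[i], a[j]), max(a[i], a[j])
        a[i], a[j] = lo, hi
    return a


if __name__ == "__main__":
    # Exhaustive check over all orderings of four distinct keys.
    assert all(sort4(p) == [1, 2, 3, 4] for p in permutations([1, 2, 3, 4]))
    print(sort4([3, 3, 1, 2]))
```

Because the comparison pattern never depends on the input, each compare-exchange can in principle be compiled to branch-free code, which is the property the summary above attributes to conditional moves.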

### Massively Parallel 'Schizophrenic' Quicksort

- Computer Science
- 2017

A communication library based on MPI that supports communicator creation in constant time and without communication is presented, together with the first efficient implementation of Schizophrenic Quicksort, a recursive sorting algorithm for distributed memory systems that is based on Quicksort.

### Parallel quicksort algorithm on OTIS hyper hexa-cell optoelectronic architecture

- Computer Science
- J. Parallel Distributed Comput.
- 2020

### Connected Components on a PRAM in Log Diameter Time

- Computer Science
- SPAA
- 2020

This work presents an $O(\log d + \log\log_{m/n} n)$-time randomized PRAM algorithm for computing the connected components of an $n$-vertex, $m$-edge undirected graph with maximum component diameter $d$ and suggests that additional power might not be necessary for fundamental graph problems like connected components and spanning forest.

### Decentralized Online Scheduling of Malleable NP-hard Jobs

- Computer Science, Business
- Euro-Par
- 2022

This work addresses an online job scheduling problem in a large distributed computing environment, using the NP-complete problem of propositional satisfiability (SAT) as a case study, and shows that its approach leads to near-optimal utilization, imposes minimal computational overhead, and performs fair scheduling of incoming jobs within a few milliseconds.

## References

### A Randomized Parallel Sorting Algorithm with an Experimental Study

- Computer Science
- J. Parallel Distributed Comput.
- 1998

A novel variation on sample sort which uses only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead, and its performance is invariant over the set of input distributions unlike previous efficient algorithms.

### Practical Massively Parallel Sorting

- Computer Science
- SPAA
- 2015

The algorithms are multi-level generalizations of the known algorithms sample sort and multiway mergesort, and turn out to be very scalable both in theory and in practice, scaling up to $2^{15}$ MPI processes with outstanding performance, in particular for medium-sized inputs.

### Highly scalable parallel sorting

- Computer Science
- 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
- 2010

A scalable extension of the Histogram Sorting method is presented, making fundamental modifications to the original algorithm in order to minimize message contention and exploit overlap.

### On the Efficient Implementation of Massively Parallel Quicksort

- Computer Science
- 1997

A high performance variant of parallel Quicksort which incorporates the following optimizations: stop the recursion at the right time, sort locally first, use accurate yet efficient pivot selection strategies, streamline communication patterns, use locality preserving processor indexing schemes and work with multiple pivots at once.

### Efficient Massively Parallel Quicksort

- Computer Science
- IRREGULAR
- 1997

This work has implemented a high performance variant of parallel quicksort which incorporates the following optimizations: Stop the recursion at the right time, sort locally first, use accurate yet efficient pivot selection strategies, streamline communication patterns, use locality preserving processor indexing schemes and work with multiple pivots at once.

### HykSort: a new variant of hypercube quicksort on distributed memory architectures

- Computer Science
- ICS '13
- 2013

HykSort is an optimized comparison sort for distributed memory architectures that attains more than 2× improvement over bitonic sort and samplesort. The paper also presents a staged communication samplesort, which is more robust than the original samplesort for large core counts.

### Parallel Quicksort in hypercubes

- Computer Science
- SAC '92
- 1992

A new parallel algorithm named Cubequicksort, modified from Hyperquicksort, is presented. It performs better than the other three algorithms compared and makes better estimates of median keys to ensure a more balanced key distribution among the processor nodes.

### A comparison of sorting algorithms for the connection machine CM-2

- Computer Science
- SPAA '91
- 1991

A fast sorting algorithm for the Connection Machine Supercomputer model CM-2 is developed and it is shown that any $O(\lg n)$-depth family of sorting networks can be used to sort $n$ numbers in $O(\lg n)$ time in the bounded-degree fixed interconnection network domain.

### Super Scalar Sample Sort

- Computer Science
- ESA
- 2004

The main algorithmic insight is that element comparisons can be decoupled from expensive conditional branching using predicated instructions, which facilitates optimizations like loop unrolling and software pipelining.
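
One way to picture the decoupling described above is bucket classification through an implicit binary search tree of splitters, where the comparison result is used as an integer in index arithmetic instead of steering an if/else. The sketch below is a minimal illustration under that assumption; `build_tree` and `classify` are hypothetical names, and Python itself offers no predicated instructions, so only the control-flow shape is shown, not the hardware effect.

```python
def build_tree(splitters):
    """Store k-1 sorted splitters as an implicit binary search tree.

    The tree is 1-indexed (slot 0 is unused); k must be a power of two.
    """
    k = len(splitters) + 1
    if k < 2 or k & (k - 1):
        raise ValueError("number of buckets must be a power of two")
    tree = [None] * k

    def fill(node, lo, hi):
        # Place the median of splitters[lo:hi] at this node, then recurse.
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        tree[node] = splitters[mid]
        fill(2 * node, lo, mid)
        fill(2 * node + 1, mid + 1, hi)

    fill(1, 0, len(splitters))
    return tree


def classify(x, tree):
    """Return the bucket index of x in the range 0..k-1."""
    k = len(tree)
    levels = k.bit_length() - 1  # log2(k) tree levels
    j = 1
    for _ in range(levels):
        # The comparison yields 0 or 1 and is folded into the child index,
        # so the loop body is identical for every element.
        j = 2 * j + (x > tree[j])
    return j - k


if __name__ == "__main__":
    tree = build_tree([10, 20, 30])
    print([classify(x, tree) for x in [5, 15, 20, 25, 35]])
```

Since every element runs the same fixed number of iterations with no data-dependent branch, a loop of this shape is a natural candidate for the unrolling and software pipelining the summary mentions.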

### Resource Oblivious Sorting on Multicores

- Computer Science
- ICALP
- 2010

A deterministic sorting algorithm, Sample, Partition, and Merge Sort (SPMS), that interleaves the partitioning of a sample sort with merging and sorts $n$ elements in $O(n \log n)$ time cache-obliviously with an optimal number of cache misses.