The input/output complexity of sorting and related problems

@article{Aggarwal1988TheIC,
  title={The input/output complexity of sorting and related problems},
  author={Alok Aggarwal and Jeffrey Scott Vitter},
  journal={Commun. ACM},
  year={1988},
  volume={31},
  pages={1116-1127}
}
We provide tight upper and lower bounds, up to a constant factor, for the number of inputs and outputs (I/Os) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition. The bounds hold both in the worst case and in the average case, and in several situations the constant factors match. Secondary storage is modeled as a magnetic disk capable of transferring P…

Algorithms for parallel memory, I: Two-level memories

We provide the first optimal algorithms in terms of the number of input/outputs (I/Os) required between internal memory and multiple secondary storage devices for the problems of sorting, FFT, matrix

Lower bounds for external memory integer sorting via network coding

A tight conditional lower bound on the complexity of external memory sorting of integers is presented, based on a famous conjecture in network coding by Li and Li, who conjectured that network coding cannot help anything beyond the standard multicommodity flow rate in undirected graphs.

Sequence sorting in secondary storage

The results show, somewhat counterintuitively, that the I/O complexity of string sorting depends upon the length of the strings relative to the block size.

Large-scale sorting in parallel memories (extended abstract)

An elegant, easy-to-implement, optimal, deterministic algorithm for external sorting with P disk drives is presented, which answers the open problem posed by Vitter and Shriver.

Lower bounds for external memory integer sorting via network coding

A tight conditional lower bound on the complexity of external memory sorting of integers is presented, based on a famous conjecture in network coding by Li and Li (2004), who conjectured that network coding cannot help anything beyond the standard multicommodity flow rate in undirected graphs.

A Framework for Simple Sorting Algorithms on Parallel Disk Systems

A simple parallel sorting algorithm is presented, and it is proved to yield a sparse enumeration sort on the hypercube that is simpler than the classical algorithm of Nassimi and Sahni.

Optimal and Practical Algorithms for Sorting on the PDM

A randomized mergesort algorithm is presented that is based on a simple idea, sorts using an asymptotically optimal number of I/O operations with high probability, and has all of the desirable features for practical implementation.

Algorithms and Data Structures for External Memory

  • J. Vitter
  • Computer Science
    Found. Trends Theor. Comput. Sci.
  • 2006
The state of the art in the design and analysis of algorithms and data structures for external memory (or EM for short), where the goal is to exploit locality and parallelism in order to reduce I/O costs, is surveyed.

Efficient bundle sorting

An efficient algorithm for bundle sorting in external memory is presented; it requires at most $c(N/B)\log_{M/B} k$ disk accesses and is shown to be optimal by proving a matching lower bound for bundling together identical keys.
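
To give a feel for the size of the bundle-sorting bound quoted above, here is a minimal sketch that simply evaluates c(N/B) log_{M/B} k for chosen parameters. Everything named here is an assumption for illustration: the function `bundle_sort_io_bound` and its parameter names are hypothetical, N, M, and B are read in the usual external-memory sense (number of records, internal memory size, block size), k is taken to be the number of distinct keys, and the constant c is not given in the summary, so it defaults to 1 as a placeholder.

```python
import math


def bundle_sort_io_bound(n, m, b, k, c=1.0):
    """Evaluate c * (N/B) * log_{M/B}(k).

    Hypothetical helper: n = records, m = internal memory size,
    b = block size, k = number of distinct keys (assumed reading),
    c = unspecified constant (placeholder default of 1).
    """
    if n <= 0 or b <= 0 or k < 1:
        raise ValueError("need n > 0, b > 0, and k >= 1")
    if m <= b:
        raise ValueError("need m > b so that the log base M/B exceeds 1")
    # math.log(x, base) computes log(x) / log(base).
    return c * (n / b) * math.log(k, m / b)


if __name__ == "__main__":
    # With M/B = 16 and k = 16 the logarithm is 1, so the value is N/B.
    print(bundle_sort_io_bound(n=1024, m=64, b=4, k=16))
```

The function only evaluates the formula; it says nothing about how the algorithm itself reaches that number of disk accesses.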

Minimizing the input/output bottleneck

This thesis gives the first known algorithms for sorting efficiently in single Uniform Memory Hierarchy, and shows how to achieve optimal I/O performance of VLSI implementations of lattice computations by transferring less information, and gives matching upper and lower bounds.

References

Tight Bounds on the Complexity of Parallel Sorting

  • F. Leighton
  • Computer Science, Mathematics
    IEEE Transactions on Computers
  • 1985
Tight upper and lower bounds are proved on the number of processors, information transfer, wire area, and time needed to sort N numbers in a bounded-degree fixed-connection network.

I/O complexity: The red-blue pebble game

Using the red-blue pebble game formulation, a number of lower bound results for the I/O requirement are proven and may provide insight into the difficult task of balancing I/O and computation in special-purpose system designs.

The Design and Analysis of BucketSort for Bubble Memory Secondary Storage

A hypothetical Bucket-Sort implementation that uses bubble memory is described and a new software marking technique is introduced that reduces the effective time for an associative search.

Permuting Information in Idealized Two-Level Storage

  • R. W. Floyd
  • Mathematics
    Complexity of Computer Computations
  • 1972
Assume a computer with a (relatively) slow and large memory consisting of pages, each with a capacity of p records. Available operations for manipulating information in slow memory are limited to

Time Bounds for Selection

The Universality of the Shuffle-Exchange Network

The inherent relationship between the shuffle-exchange network and the Benes binary network is specified so that designers can have a broad prospect.

The I/O Performance of Multiway Mergesort and Tag Sort

Models of secondary storage are developed to evaluate external sorting and are used to analyze the average I/O access time of mergesort and tag sort on files with uniform key distribution; it is shown that for large files tag sort takes asymptotically less I/O time than mergesort.

The Art of Computer Programming, Volume III: Sorting and Searching

Parallelism in space-time tradeoffs

  • Advances in Computing Research Special issue on Parallel and Distributed Computing
  • 1987