• Corpus ID: 2262776

A Low Communication Sort Algorithm for a Parallel Database Machine

@inproceedings{Lorie1989ALC,
  title={A Low Communication Sort Algorithm for a Parallel Database Machine},
  author={Raymond A. Lorie and Honesty C. Young},
  booktitle={VLDB},
  year={1989}
}
The paper considers the problem of sorting a file in a distributed system. The file is originally distributed on many sites, and the result of the sort is needed at another site called the “host”. The particular environment that we assume is a backend parallel database machine, but the work is applicable to distributed database systems as well. After discussing the drawbacks of several existing algorithms, we propose a novel algorithm that exhibits complete parallelism during the sort, merge… 

Exploiting database parallelism in a message-passing multiprocessor
TLDR
This paper focuses entirely on exploiting parallel-processing configurations in which general-purpose processors communicate only via message passing, and in this configuration, the database is partitioned among the processors.
A practical external sort for shared disk MPPs
TLDR
The implementation of the sample sort algorithm described here meets the requirements of real world constraints and is suitable for shared disk MPP computer systems.
PPS-a parallel partition sort algorithm for multiprocessor database systems
TLDR
Experimental results demonstrate that the new algorithm performs better than existing parallel range partition sorting algorithms in a shared-nothing database environment for a wide degree of skew.
The parameterized Round-Robin partitioned algorithm for parallel external sort
  • H. Young, A. Swami
  • Computer Science
    Proceedings of 9th International Parallel Processing Symposium
  • 1995
TLDR
A new parameterized parallel sort algorithm, called Round-Robin Partitioned (RRP), is presented for the message-passing (shared-nothing) architecture and is shown to be superior to the other algorithms for almost all configurations.
A practical external sort for shared disk MPP's
TLDR
The implementation of the sample sort algorithm described here meets the requirements of real world constraints and is suitable for shared disk MPP computer systems.
Sorting Large Data Files on POOMA
TLDR
The results show that the benchmark is able to exploit the full capabilities of the computing power, the storage devices, and the communication bandwidth, and they demonstrate the applicability of the POOMA platform for this application, even where the POOL implementation was, at the time of the experiment, far from optimal.
Parallel Sorting of Large Data Volumes on Distributed Memory Multiprocessors
TLDR
This algorithm is suited for large data volumes (external sorting) and does not suffer from processing skew in the presence of data skew; the optimal degree of CPU parallelism is derived when I/O limitations are taken into account.
Overlapping Computations, Communications and I/O in parallel Sorting
TLDR
A new parallel sorting algorithm which maximizes the overlap between the disk, network, and CPU subsystems of a processing node is presented, which is shown to be of similar complexity to known efficient sorting algorithms.

References

Sorting Large Files on a Backend Multiprocessor
TLDR
The results show that, using current off-the-shelf technology coupled with a streamlined distributed operating system, three- and five-microprocessor configurations provide a very cost-effective sort of large files.
A taxonomy of parallel sorting
TLDR
This paper analyzes the evolution of research on parallel sorting, from the earliest sorting networks to the shared memory algorithms and the VLSI sorters, and proposes a taxonomy of parallel sorting that includes a broad range of array and file sorting algorithms.
Parallel Sorting Algorithms
An evaluation of sorting algorithms for common-bus local networks
TLDR
This paper evaluates four alternate methods of performing external sort in common-bus local networks by observing its behavior at different network speeds, file sizes, network sizes, page sizes, I/O times, and interrupt and synchronization times.
An Adaptive Method for Unknown Distributions in Distributive Partitioned Sorting
TLDR
An adaptation of DPS, which estimates the cumulative distribution function of the input data from a randomly selected sample, was developed and tested, and runs only 2-4 percent slower than DPS in the uniform case, but outperforms DPS by 12-13 percent on exponentially distributed data for sufficiently large files.
Parallelism in tape-sorting
TLDR
Two methods for employing parallelism in tape-sorting are presented and both approximately achieve the goal of reducing the processing time by a divisor which is the number of processors.
Adding Intra-transaction Parallelism to an Existing DBMS: Early Experience
  • IEEE Data Eng. Bull.
  • 1989