# Parallel Sorting Methods for Large Data Volumes on a Hypercube Database Computer

@inproceedings{Baugst1989ParallelSM, title={Parallel Sorting Methods for Large Data Volumes on a Hypercube Database Computer}, author={Bj{\o}rn Arild W. Baugst{\o} and Jarle Fredrik Greipsland}, booktitle={IWDM}, year={1989} }

Sorting is one of the basic operations in any database system. In this paper we present two external sorting algorithms for hypercube database computers. The methods are based on partitioning the data according to partition values obtained by sampling the data. One of the algorithms, which is implemented on the HC16 database computer designed at The Norwegian Institute of Technology, is described in detail, together with a performance evaluation and some test results.
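The partitioning idea in the abstract can be illustrated with a small in-memory sketch. This is not the authors' HC16 implementation: the function names (`choose_splitters`, `partition_sort`), the parameters `num_nodes` and `sample_size`, and the evenly spaced quantile rule are all assumptions made for illustration, and disk I/O and hypercube communication are not modeled.

```python
import random
from bisect import bisect_right


def choose_splitters(records, num_nodes, sample_size, rng):
    """Pick num_nodes - 1 partition values from a random sample of the keys.

    Hypothetical helper: the quantile rule is an assumption, not the paper's.
    """
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    # Evenly spaced positions in the sorted sample approximate key quantiles.
    return [sample[(i * len(sample)) // num_nodes] for i in range(1, num_nodes)]


def partition_sort(records, num_nodes=4, sample_size=64, seed=0):
    """Sort by sampling, partitioning into per-node buckets, and local sorting."""
    if not records:
        return []
    rng = random.Random(seed)
    splitters = choose_splitters(records, num_nodes, sample_size, rng)

    # Route each key to the (simulated) node whose key range contains it.
    buckets = [[] for _ in range(num_nodes)]
    for key in records:
        buckets[bisect_right(splitters, key)].append(key)

    # Each node sorts its own bucket; because the key ranges are disjoint and
    # ordered, concatenating the buckets in node order gives a sorted result.
    return [key for bucket in buckets for key in sorted(bucket)]
```

In this sketch the quality of the sample only affects how evenly the buckets are filled, not correctness: a poor sample yields unbalanced buckets, but the concatenated output is still sorted.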

## 34 Citations

Multiprocessor algorithms for relational-database operators on hypercube systems

- Computer Science
- Computer
- 1990

This tutorial focuses on hypercube interconnected architectures as a computational engine for relational-database processing. Experimental results obtained from a portable hypercube-based database system are presented to characterize performance potential for various uniscan and multiscan operations.

Parallel Relational Database Algorithms

- Computer Science
- 1993

The paper describes two classes of algorithms to perform relational database operations in parallel on a distributed memory parallel computer with a disk for each processor and shows how a bucket algorithm can be used to sort a relation.

Parallel Sorting of Large Data Volumes on Distributed Memory Multiprocessors

- Computer Science
- Parallel Computer Architectures
- 1993

This algorithm is suited for large data volumes (external sorting) and does not suffer from processing skew in the presence of data skew. The optimal degree of CPU parallelism is derived when I/O limitations are taken into account.

On the design, implementation, and evaluation of a portable parallel database system

- Computer Science
- Proceedings. PARBASE-90: International Conference on Databases, Parallel Architectures, and Their Applications
- 1990

A portable parallel database system that exploits both parallel algorithms and data parallelism to expedite database processing is described and it is shown that, for joins with a comparable number of tuples in each of the two joining relations, a bucket-based approach is preferable.

External Sorting for Databases in Distributed Heterogeneous Systems

- Computer Science
- 1993

This paper describes a new, load-balanced external parallel sorting method which is more robust to data skew and to variable speed of processes, and compares the run time of the new method with an analogous conventional method in case of data skew and load imbalances.

Duplicate removal on hypercube engines: an experimental analysis

- Computer Science
- Parallel Comput.
- 1991

Experimentation with hypercube database engines

- Computer Science
- IEEE Micro
- 1992

Using Intel's iPSC/2 hypercube, the authors measured the effect of packet size, method of clustering messages, and internode traffic on the total sustained communication bandwidth and analyzed duplicate removal algorithms.

Parallel sorting on a shared-nothing architecture using probabilistic splitting

- Computer Science
- [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems
- 1991

The authors consider the problem of external sorting in a shared-nothing multiprocessor with two techniques for determining ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, which uses sampling to estimate quantiles.

Relational Algebra Operations

- Computer Science
- PRISMA Workshop
- 1990

A set of relational algebra operations is described and slightly enhanced for improved efficiency, and the results of the DeWitt join test runs are given.

## References

Parallel Partition Sort for Database Machines

- Computer Science
- IWDM
- 1987

A new parallel sorting method, called parallel partition sort, is discussed. It is based on top-down partitioning of data, transfers only a small amount of data, and does not place large demands on the CPU.

Algebra Operations on a Parallel Computer - Performance Evaluation

- Computer Science
- IWDM
- 1987

The design of a parallel database computer that contains 8 single board computers that communicate over a system of shared RAM, allowing fast communication without interference, and test results are reported.

Data Structures and Algorithms

- Computer Science
- 1983

The basis of this book is the material contained in the first six chapters of the earlier work, The Design and Analysis of Computer Algorithms, with added material on algorithms for external storage and memory management.

Join on a Cube: Analysis, Simulation, and Implementation

- Computer Science
- IWDM
- 1987

This paper discusses one part of the work, viz., the study of the join operation, where novel data redistribution operations are employed to improve the performance of the various database operations including join.

Multiprocessor Hash-Based Join Algorithms

- Computer Science
- VLDB
- 1985

It is demonstrated that bit vector filtering provides dramatic improvement in the performance of all algorithms, including the sort-merge join algorithm, and is shown to provide linear increases in throughput with corresponding increases in processor and disk resources.

Sorting and Searching

- Computer Science
- 1973

The first revision of this third volume is a survey of classical computer techniques for sorting and searching. It extends the treatment of data structures in Volume 1 to consider both large and…

Hashing Methods and Relational Algebra Operations

- Computer Science
- VLDB
- 1984

The relational algebra operations described in this paper are under implementation in TECHRA (TECHBC), a database system especially designed to meet the needs of technical applications, like CAD systems, utility maps, oil field exploration, etc.

A Neighbor Connected Processor Network for Performing Relational Algebra Operations

- Computer Science
- CAW '80
- 1980

The capacity of the communication network has been analyzed under the workload of relational algebra operations and each of 2 or 3 cells have been found to give the highest processing capacity per cell in the network.

Binsorting on hypercubes with d-port communication

- Computer Science
- C3P
- 1989

Three sorting algorithms are given for hypercubes with d-port communication based on binsort at the global level to reduce communication costs and reduce the variance among the lengths of the subsequences left in the nodes after the complete exchange of bins.