Parallel algorithms for the execution of relational database operations

@article{Bitton1983ParallelAF,
  title={Parallel algorithms for the execution of relational database operations},
  author={Dina Bitton and Haran Boral and David J. DeWitt and Kevin Wilkinson},
  journal={ACM Trans. Database Syst.},
  year={1983},
  volume={8},
  pages={324--353}
}
This paper presents and analyzes algorithms for parallel processing of relational database operations in a general multiprocessor framework. To analyze alternative algorithms, we introduce an analysis methodology which incorporates I/O, CPU, and message costs and which can be adjusted to fit different multiprocessor architectures. Algorithms are presented and analyzed for sorting, projection, and join operations. While some of these algorithms have been presented and analyzed previously, we… 


A Study of Sort Algorithms for Multiprocessor Database Machines
TLDR
This paper proposes a new algorithm called the modified block bitonic sort, which is the fastest of the algorithms over a wide range of values of interest to us, and presents the results of analyzing these different parallel external sorting algorithms.
A Technique for Analyzing Query Execution in a Multiprocessor Database Machine
TLDR
A methodology for representing and evaluating the execution of relational queries by a multiprocessor data base machine (DBM) and a procedure for computing the execution cost of the query is given.
Response Time Analysis of Multiprocessor Computers for Database Support
Comparison of three multiprocessor computer architectures for database support is made possible through evaluation of response time expressions. These expressions are derived by parameterizing
Parallel Algorithms for the Execution of Relational Database Operations Revisited On Grids
TLDR
It is shown that an expressive model can be built upon just three characteristic parameter sets, namely the node processing performance and the network and the disk bandwidths, and that using smart enhancement to exploit the heterogeneity of the grid, the performance of the algorithms for database operations can be increased remarkably.
The Join Algorithms on a Shared-Memory Multiprocessor Database Machine
TLDR
This study shows, among other things, that for a given hardware configuration there is not just one overall best performing join algorithm, but rather different algorithms score the best performance, depending on the characteristics of the data participating in the join operation.
Effective skew handling for parallel sorting in multiprocessor database systems
  • Yu-lung Lo, Yu-chen Huang
  • Computer Science
    Ninth International Conference on Parallel and Distributed Systems, 2002. Proceedings.
  • 2002
TLDR
This work presents two parallel sorting algorithms using the dynamic load balancing technique to address the data skew problem and indicates that the proposed parallel sorting techniques can provide very impressive performance improvement over conventional approaches.
Using shared virtual memory for parallel join processing
In this paper, we show that shared virtual memory, in a shared-nothing multiprocessor, facilitates the design and implementation of parallel join processing algorithms that perform significantly
A Parallel Logging Algorithm for Multiprocessor Database Machine
TLDR
A recovery architecture based on parallel logging for the multiprocessor-cache class of database machines and the results of the evaluation of its impact on the performance of the database machine are presented.

References

SHOWING 1-10 OF 23 REFERENCES
Direct—A Multiprocessor Organization for Supporting Relational Database Management Systems
The design of DIRECT, a multiprocessor organization for supporting relational database management systems is presented. DIRECT has a multiple-instruction multiple-data stream (MIMD) architecture. It
Design considerations for data-flow database machines
TLDR
The performance of multiprocessor nested-loops and sort-merge join algorithms is analyzed and it is shown that the nested-loops algorithm is generally superior and the third level of granularity, a page of a relation, is shown to be the best choice from both hardware and software viewpoints.
Design, analysis, and implementation of parallel external sorting algorithms
TLDR
A modified merge-sort is proposed to use as a method for eliminating duplicate records in a large file and a combinatorial model is developed to provide an accurate estimate for the cost of the duplicate elimination operation (both in the serial and the parallel cases).
A PERFORMANCE EVALUATION OF DATABASE MACHINE ARCHITECTURES
TLDR
It is demonstrated that no one type of database machine is best for executing all types of queries and that for several classes of queries certain database machine designs which have been proposed are actually slower than a DBMS on a conventional processor.
Optimal Sorting Algorithms for Parallel Computers
TLDR
The problem of sorting a sequence of n elements on a parallel computer with k processors is considered and each achieves an asymptotic speed-up ratio of k with respect to the best sequential algorithm, which is optimal in the number of processors used.
RAP: an associative processor for data base management
TLDR
Pointer mechanisms for mapping structures and providing fast access paths have to be implemented by software and data for efficient search mechanisms to handle large data bases within concurrent processing and on-line response limits.
Sorting networks and their applications
To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several
Implementing a relational database by means of specialized hardware
New hardware is described which allows the rapid execution of queries demanding the joining of physically stored relations. The main feature of the hardware is a special store which can rapidly
CASSM: a cellular system for very large data bases
TLDR
The application view of CASSM, a Context Addressed Segment Sequential Memory implemented on a head-per-track disc and an array of non-numeric microprocessors, offers potential solutions to several large data base problems.
The design of a rotating associative memory for relational database applications
TLDR
RARES is designed to enhance the performance of an optimizing relational query interface by supporting important high level optimization techniques and can perform tuple selection operations at the storage device and also can provide a mechanism for efficient sorting.