Tuning a parallel database algorithm on a shared‐memory multiprocessor
@article{Graefe1992TuningAP,
  title={Tuning a parallel database algorithm on a shared‐memory multiprocessor},
  author={Goetz Graefe and Shreekant S. Thakkar},
  journal={Software: Practice and Experience},
  year={1992},
  volume={22}
}
Database query processing can benefit significantly from parallelism. Parallel database algorithms combine substantial CPU and I/O activity, memory requirements, and massive data exchange between processes, all of which must be considered to obtain optimal performance. Since parallel external sorting is a very typical example, we have focused on sorting to tune Volcano, a new query processing system. The purpose of the Volcano project is to provide efficient, extensible tools for query and…
21 Citations
Query evaluation techniques for large databases
- Computer Science
- CSUR
- 1993
This survey describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
Volcano - An Extensible and Parallel Query Evaluation System
- Computer Science
- IEEE Trans. Knowl. Data Eng.
- 1994
Volcano is the first implemented query execution engine that effectively combines extensibility and parallelism, and is extensible with new operators, algorithms, data types, and type-specific methods.
Encapsulation of Parallelism and Architecture-Independence in Extensible Database Query Execution
- Computer Science
- IEEE Trans. Software Eng.
- 1993
The authors justify their decision to support hierarchical architectures and argue that the exchange operator offers a significant advantage for development and maintenance of database query processing software.
Alphasort: A cache-sensitive parallel external sort
- Computer Science
- The VLDB Journal
- 1995
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads and argues that modern architectures require algorithm designers to re-examine their use of the memory hierarchy.
Sort vs. Hash Revisited
- Computer Science
- 2004
This article compares the concepts behind sort- and hash-based query-processing algorithms and concludes that many dualities exist between the two types of algorithms and there is a strong reason why both hash- and sort-based algorithms should be available in a query-processing system.
Domain-Partitioned Parallel Sort-Merge Join
- Computer Science
- 1995
It is concluded that parallel sort-merge join is inferior to hash-based join algorithms unless the joining relations are already sorted.
CHAPTER 1-INTRODUCTION
- Computer Science
- 1999
There has been a continuing increase in the amount of data handled by database management systems (DBMSs) in recent years, with a growing need for DBMSs to exhibit more sophisticated functionality such as the support of object-oriented, deductive, and multimedia-based applications.
AlphaSort: a RISC machine sort
- Computer Science
- SIGMOD '94
- 1994
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads and proposes two new benchmarks: Minutesort: how much can you sort in a minute, and DollarSort: how to sort for a dollar.
Sort versus Hash Revisited
- Computer Science
- IEEE Trans. Knowl. Data Eng.
- 1994
This article compares the concepts behind sort- and hash-based query-processing algorithms and concludes that there is a strong reason why both hash- and sort-based algorithms should be available in a query-processing system.
Adaptive Parallel Query Execution in DBS3
- Computer Science
- EDBT
- 1996
DBS3, a shared-memory database system implemented on a 72-node KSR1 multiprocessor, is described, which addresses problems of start-up time of parallel operations, interference and poor load balancing among the processors due to skewed data distribution.
References
Prototyping Bubba, A Highly Parallel Database System
- Computer Science
- IEEE Trans. Knowl. Data Eng.
- 1990
The current Bubba prototype runs on a commercial 40-node multicomputer and includes a parallelizing compiler, distributed transaction management, object management, and a customized version of Unix.
Join processing in database systems with large main memories
- Computer Science
- TODS
- 1986
A new algorithm is presented which is a hybrid of two hash-based algorithms and which dominates the other algorithms presented, including sort-merge. Even in a virtual memory environment, the hybrid algorithm dominates all the others.
Design, analysis, and implementation of parallel external sorting algorithms
- Computer Science
- 1981
A modified merge-sort is proposed to use as a method for eliminating duplicate records in a large file and a combinatorial model is developed to provide an accurate estimate for the cost of the duplicate elimination operation (both in the serial and the parallel cases).
Sampling Issues in Parallel Database Systems
- Computer Science
- EDBT
- 1992
This paper proves that for query size estimation, stratified random sampling guarantees perfect load balancing without reducing the accuracy of the estimate, and that for a given number of I/O operations, page level sampling always produces a more accurate estimate than tuple level sampling.
A Low Communication Sort Algorithm for a Parallel Database Machine
- Computer Science
- VLDB
- 1989
This work proposes a novel algorithm that exhibits complete parallelism during the sort, merge, and return-to-host phases, and decreases the amount of inter-processor communication compared to existing parallel sort algorithms.
Encapsulation of parallelism in the Volcano query processing system
- Computer Science
- SIGMOD '90
- 1990
The reasons for not choosing the bracket model, the novel operator model, and details of Volcano's exchange operator that parallelizes all other operators are described, which makes implementation of parallel database algorithms significantly easier and more robust.
The Gamma Database Machine Project
- Computer Science
- IEEE Trans. Knowl. Data Eng.
- 1990
The design of the Gamma database machine and the techniques employed in its implementation are described and a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is presented.
A taxonomy of parallel sorting
- Computer Science
- CSUR
- 1984
This paper analyzes the evolution of research on parallel sorting, from the earliest sorting networks to the shared memory algorithms and the VLSI sorters, and proposes a taxonomy of parallel sorting that includes a broad range of array and file sorting algorithms.
A Study of Sort Algorithms for Multiprocessor Database Machines
- Computer Science
- VLDB
- 1986
This paper proposes a new algorithm called the modified block bitonic sort, which is the fastest of the algorithms over a wide range of values of interest to us, and presents the results of analyzing these different parallel external sorting algorithms.
Sort versus Hash Revisited
- Computer Science
- IEEE Trans. Knowl. Data Eng.
- 1994
This article compares the concepts behind sort- and hash-based query-processing algorithms and concludes that there is a strong reason why both hash- and sort-based algorithms should be available in a query-processing system.