This paper describes a customized database and a comprehensive set of queries that can be used for systematic benchmarking of relational database systems. Designing this database and a set of carefully tuned benchmarks represents a first attempt at developing a scientific methodology for performance evaluation of database management systems. We have used…
Disk shadowing is a technique for maintaining a set of two or more identical disk images on separate disk devices. Its primary purpose is to enhance reliability and availability of secondary storage by providing multiple paths to redundant data. However, shadowing can also boost I/O performance. In this paper, we contend that intelligent device scheduling of…
The issue of duplicate elimination for large data files in which many occurrences of the same record may appear is addressed. A comprehensive cost analysis of the duplicate elimination operation is presented. This analysis is based on a combinatorial model developed for estimating the size of intermediate runs produced by a modified merge-sort procedure.…
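As a rough illustration of the idea in the abstract above, the sketch below removes duplicates while sorting, so that intermediate runs shrink as merging proceeds. It is a minimal in-memory sketch under stated assumptions, not the procedure analyzed in the paper: the function names `dedup_sorted` and `merge_sort_distinct` and the `run_size` parameter are hypothetical, records are assumed to be hashable and comparable, and a real implementation would read and write runs on disk.

```python
from heapq import merge
from itertools import groupby


def dedup_sorted(records):
    """Yield each distinct record once from an already sorted iterable."""
    for record, _group in groupby(records):
        yield record


def merge_sort_distinct(records, run_size=4):
    """Sort records and drop duplicates as early as possible.

    Hypothetical sketch: duplicates are removed inside each initial run
    and again at every merge step, so intermediate runs never hold two
    copies of the same record.
    """
    # Phase 1: build sorted initial runs, dropping duplicates within each run.
    runs = [
        sorted(set(records[i:i + run_size]))
        for i in range(0, len(records), run_size)
    ]
    # Phase 2: merge runs two at a time, dropping duplicates when they meet.
    while len(runs) > 1:
        merged_runs = []
        for i in range(0, len(runs), 2):
            pair = runs[i:i + 2]
            merged_runs.append(list(dedup_sorted(merge(*pair))))
        runs = merged_runs
    return runs[0] if runs else []
```

Calling `merge_sort_distinct([5, 3, 5, 1, 3, 3, 2, 1, 5])` returns the sorted distinct records. Because duplicates are discarded at every pass rather than in a final scan, later merge passes handle less data, which is the effect a model of intermediate run sizes would need to estimate.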
The goal of EII systems is to provide uniform access to multiple data sources without having to first load them into a data warehouse. Since the late 1990s, several EII products have appeared in the marketplace and significant experience has been accumulated from fielding such systems. This collection of articles, by individuals who were involved in this…
This paper presents and analyzes algorithms for parallel processing of relational database operations in a general multiprocessor framework. To analyze alternative algorithms, we introduce an analysis methodology which incorporates I/O, CPU, and message costs and which can be adjusted to fit different multiprocessor architectures. Algorithms are presented…
We propose a taxonomy of parallel sorting that encompasses a broad range of array- and file-sorting algorithms. We analyze how research on parallel sorting has evolved, from the earliest sorting networks to shared memory algorithms and VLSI sorters. In the context of sorting networks, we describe two fundamental parallel merging schemes: the odd-even and the…
Optical disks are among the most promising secondary storage devices for data-intensive applications and database management systems. A means of optimizing the storage capacity of optical disks is presented here.
In 1983, the first commercial database machines were becoming available and new prototypes, such as the GRACE Database Machine [Kit83], had been proposed. The Wisconsin Benchmark [BDT83] was created in an attempt to compare backend database machine architectures to software relational database systems running on a general purpose computer.…