• Publications
Improved histograms for selectivity estimation of range predicates
TLDR
We provide a taxonomy of histograms that captures all previously proposed histogram types and indicates many new possibilities.
Selectivity Estimation Without the Attribute Value Independence Assumption
TLDR
We propose two main alternatives to effectively approximate (multi-dimensional) joint data distributions: (a) using a multi-dimensional histogram, and (b) using the Singular Value Decomposition (SVD) technique from linear algebra.
Bitmap index design and evaluation
TLDR
We present a general framework to study the design space of bitmap indexes for selection queries and examine the disk-space and time characteristics that the various alternative index choices offer.
Randomized algorithms for optimizing large join queries
TLDR
We propose a new Two Phase Optimization algorithm, which is a combination of Simulated Annealing and Iterative Improvement, for project-select-join queries.
The History of Histograms (abridged)
TLDR
The history of histograms is long and rich, full of detailed information in every step.
An efficient bitmap encoding scheme for selection queries
TLDR
In this paper, we establish a number of optimality results for the existing bitmap encoding schemes; in particular, we prove that neither of the two known schemes is optimal for the class of two-sided range queries.
BioMagResBank
TLDR
The BioMagResBank (BMRB: www.bmrb.wisc.edu) is a repository for experimental and derived data gathered from nuclear magnetic resonance (NMR) spectroscopic studies of biological molecules.
Histogram-Based Approximation of Set-Valued Query-Answers
TLDR
A method for generating an approximate answer in response to a query to a database in which an SQL query Q for operating on a relation R in a database is received.
Containment of conjunctive queries: beyond relations as sets
TLDR
Conjunctive queries are queries over a relational database and are at the core of relational query languages such as SQL.
Schedule optimization for data processing flows on the cloud
TLDR
In this paper, we study scheduling of dataflows that involve arbitrary data processing operators in the context of three different problems: 1) minimize completion time given a fixed budget, 2) minimize monetary cost given a deadline, and 3) find trade-offs between completion time and monetary cost without any a priori constraints.