# Implementing data cubes efficiently

@inproceedings{Harinarayan1996ImplementingDC, title={Implementing data cubes efficiently}, author={Venky Harinarayan and Anand Rajaraman and Jeffrey D. Ullman}, booktitle={SIGMOD '96}, year={1996} }

Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like total sales. The values of many of these cells are dependent on the values of other cells in the data cube. A common and powerful query optimization technique is to materialize some or all of these cells…
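The dependence between cells described above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical sales table with `part`, `supplier`, and `customer` columns (none of these names or values come from the abstract) and shows that a coarser aggregate can be computed from a finer, already materialized one instead of from the base data. It does not reproduce the paper's selection method.

```python
from collections import defaultdict

# Hypothetical base data: (part, supplier, customer, sales) rows.
ROWS = [
    ("p1", "s1", "c1", 10),
    ("p1", "s1", "c2", 5),
    ("p1", "s2", "c1", 7),
    ("p2", "s1", "c1", 3),
]


def group_by(rows, key_indexes):
    """Sum the last column of each row, grouped by the chosen key columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)


# Materialize the (part, supplier) view once from the base data.
part_supplier = group_by(ROWS, (0, 1))

# The coarser (part) view depends on it: it can be computed from the
# materialized view's rows instead of rescanning the base data.
view_rows = [key + (total,) for key, total in part_supplier.items()]
part_from_view = group_by(view_rows, (0,))

# Reference result computed directly from the base data, for comparison.
part_from_base = group_by(ROWS, (0,))
```

Both routes give the same totals per part, but the materialized view has fewer rows to scan than the base table. That saving is what makes the choice of which cells to materialize matter.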

## 1,514 Citations

Compressed data cubes for OLAP aggregate query approximation on continuous dimensions

- Computer Science
- KDD '99
- 1999

A new compressed representation of the data cube is proposed that drastically reduces storage requirements, does not require the discretization hierarchy along each query dimension to be fixed beforehand, treats each dimension as a potential target measure, and supports multiple aggregation functions without additional storage costs.

Optimizing multiple dimensional queries simultaneously in multidimensional databases

- Computer Science
- The VLDB Journal
- 2000

This paper considers in detail two cases of the problem in which all the queries are either hash-based star joins or index-based star joins only and presents the only development of polynomial algorithms for the first two cases which are able to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated.

Answering multidimensional queries on cubes using other cubes

- Computer Science
- Proceedings. 12th International Conference on Scientific and Statistical Database Management
- 2000

This paper provides a simple data model for MD databases, and a simple algebraic MD query language that permit the modeling of the principal OLAP operations, and provides instance independent expressions that compute an MD query on a cube from derived cubes.

Range queries in dynamic OLAP data cubes

- Computer Science
- Data Knowl. Eng.
- 2000

A new algorithm is provided which achieves constant time per range sum query while constraining each update cost within $O(n^{d/2})$, where d is the number of dimensions of the data cube and n is the number of distinct values of the domain at each dimension.

Efficient Materialized View Selection for Multi-Dimensional Data Cube Models

- Mathematics, Computer Science
- Int. J. Inf. Retr. Res.
- 2016

The authors present a refined greedy selection approach that uses forward references to improve materialized view selection; it works on a lattice framework that captures the interdependencies of the data.

An Optimization Problem in Data Cube System Design

- Computer Science
- PAKDD
- 2000

The approximate algorithms Greedy Removing and 2-Greedy Merging are proposed and shown to be both effective and efficient in data cube system design.

Efficient Evaluation of Sparse Data Cubes

- Computer Science
- WAIM
- 2004

A new dynamic data structure called SST (Sparse Statistics Trees) is proposed, along with a novel, interactive, and fast cube evaluation algorithm called CUPS (Cubing by Pruning SST), which is especially well suited to computing aggregates in cubes whose data sets are sparse.

Cost effective storage space for data cubes

- Computer Science
- J. Intell. Inf. Syst.
- 2017

The relation between the number of data cube views and the space limit expressed as a percentage of the fully materialized data cube size and a multiple of the base view size is analysed and it is found that the allocation of large space for views materialization is not cost effective.

Optimization in Data Cube System Design

- Computer Science
- Journal of Intelligent Information Systems
- 2004

The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are materialized precomputed multidimensional views of data…

## References

Index selection for OLAP

- Computer Science
- Proceedings 13th International Conference on Data Engineering
- 1997

The authors give algorithms that automate the selection of summary tables and indexes, and present a family of algorithms of increasing time complexities, and prove strong performance bounds for them.

Including Group-By in Query Optimization

- Computer Science
- VLDB
- 1994

It is shown that the extent of improvement in the quality of plans is significant with only a modest increase in optimization cost, and the technique also applies to optimization of Select Distinct queries by pushing down duplicate elimination in a cost-based fashion.

Query evaluation techniques for large databases

- Computer Science
- CSUR
- 1993

This survey describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.

Aggregate-Query Processing in Data Warehousing Environments

- Computer Science
- VLDB
- 1995

Generalized projections are introduced that capture aggregations, group-bys, duplicate-eliminating projections (distinct), and duplicate-preserving projections in a common unified framework, and powerful query rewrite rules for aggregate queries are developed that unify and extend rewrite rules previously known in the literature.

Multi-table joins through bitmapped join indices

- Computer Science
- SIGMOD Record
- 1995

This technical note shows how to combine some well-known techniques to create a method that will efficiently execute common multi-table joins, and outlines realistic examples where the combination of these techniques yields substantial performance improvements over alternative, more traditional query evaluation plans.

Sampling-Based Estimation of the Number of Distinct Values of an Attribute

- Computer Science
- VLDB
- 1995

This appears to be the first extensive comparison of distinct-value estimators in either the database or statistical literature, and is certainly the first to use highly skewed data of the sort frequently encountered in database applications.

A threshold of ln n for approximating set cover (preliminary version)

- Mathematics, Computer Science
- STOC '96
- 1996

We prove that (1 - o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms. This closes the gap (up to low order…

A threshold of ln n for approximating set cover

- Mathematics, Computer Science
- JACM
- 1998

It is proved that (1 - o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms.

Index Selection for OLAP

- 1996

Ullman. Index Selection for OLAP. Submitted for publication. At http://db.stanford.edu/pub/hgupta/1996/CubeIndex.ps