Improving Bitmap Index Compression by Data Reorganization


The volume of data generated by scientific applications through observations or computer simulations can reach the order of petabytes. This creates a need for effective and compact indexing methods for efficient storage and retrieval of scientific data. Bitmap indexing has been applied successfully in this domain by exploiting the fact that scientific data are mostly read-only and either enumerated or numerical. Bitmap indices can be compressed for efficient storage. In this paper, we study how to reorganize bitmap tables for improved compression rates. Our algorithms are used as a preprocessing step, so there is no need to revise current indexing techniques or query processing algorithms. For this NP-complete problem, we propose a Gray code ordering algorithm that works in place and runs in time linear in the size of the database. We also explore how the order in which columns are evaluated in the Gray code ordering affects the results, in order to further improve query execution time. Our experimental results on real data sets show that the compression ratio can be improved by a factor of 2 to 10 and query execution times by a factor of 4 to 7.
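To illustrate why reordering rows can help run-length based bitmap compression, the following is a minimal sketch, not the paper's algorithm. It assumes each row of the bitmap table is a tuple of 0/1 bits and uses an ordinary comparison sort keyed on Gray code rank, so it is neither in-place nor linear-time as the proposed method is. The names `gray_rank`, `gray_order`, and `count_runs` are hypothetical helpers introduced only for this example.

```python
from typing import List, Sequence, Tuple

Row = Tuple[int, ...]


def gray_rank(row: Sequence[int]) -> int:
    """Position of `row` in the reflected binary Gray code sequence.

    Treats the row as a Gray codeword (leftmost bit most significant)
    and decodes it to its ordinary binary index via a prefix XOR.
    """
    rank = 0
    bit = 0
    for g in row:
        bit ^= g                  # running XOR of the Gray bits seen so far
        rank = (rank << 1) | bit  # append the decoded binary bit
    return rank


def gray_order(rows: Sequence[Row]) -> List[Row]:
    """Return the rows sorted into Gray code order.

    Assumption: a comparison sort is used here for brevity; it is
    O(n log n) and not in-place, unlike the algorithm the abstract describes.
    """
    return sorted(rows, key=gray_rank)


def count_runs(rows: Sequence[Row]) -> int:
    """Total number of runs of identical bits, summed over all columns.

    Fewer runs generally means better run-length compression.
    """
    if not rows:
        return 0
    total = 0
    for col in zip(*rows):  # iterate over bitmap columns
        total += 1 + sum(1 for a, b in zip(col, col[1:]) if a != b)
    return total


if __name__ == "__main__":
    table = [(1, 0), (0, 1), (1, 1), (0, 0)]
    lexicographic = sorted(table)
    reordered = gray_order(table)
    print("lexicographic runs:", count_runs(lexicographic))
    print("Gray code runs:    ", count_runs(reordered))
```

In this toy table, adjacent rows in Gray code order differ in a single bit, so the columns contain fewer runs than under lexicographic order. The column order given to `gray_rank` determines which columns end up with the longest runs, which is the effect of column ordering that the abstract refers to.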

Cite this paper

@inproceedings{Canahuate2006ImprovingBI, title={Improving Bitmap Index Compression by Data Reorganization}, author={Guadalupe Canahuate and Hakan Ferhatosmanoglu}, year={2006} }