• Corpus ID: 16949526

CONCISE: Compressed 'n' Composable Integer Set

Alessandro Colantonio and Roberto Di Pietro
Bit arrays, or bitmaps, are used to significantly speed up set operations in several areas, such as data warehousing, information retrieval, and data mining, to name a few. However, bitmaps usually use a large storage space, thus requiring compression. Consequently, there is a space–time tradeoff among compression schemes. The Word Aligned Hybrid (WAH) bitmap compression trades some space to allow for bitwise operations without first decompressing bitmaps…
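As a rough illustration of why bitmaps speed up set operations, the sketch below stores a set of non-negative integers as the set bits of a single Python integer, so intersection, union, and difference each become one bitwise operation. This is only an uncompressed bit array, not WAH or CONCISE; the helper names `to_bitmap` and `from_bitmap` are hypothetical and chosen for this example.

```python
def to_bitmap(values):
    """Pack non-negative integers into one Python int used as a bit array."""
    bitmap = 0
    for v in values:
        bitmap |= 1 << v  # set bit number v
    return bitmap


def from_bitmap(bitmap):
    """Return the positions of the set bits in increasing order."""
    result = []
    position = 0
    while bitmap:
        if bitmap & 1:
            result.append(position)
        bitmap >>= 1
        position += 1
    return result


a = to_bitmap([1, 3, 5, 64])
b = to_bitmap([3, 4, 5, 100])

intersection = from_bitmap(a & b)   # elements present in both sets
union = from_bitmap(a | b)          # elements present in either set
difference = from_bitmap(a & ~b)    # elements of a that are not in b

# An uncompressed bitmap needs one bit per possible value up to the maximum.
bits_used_by_a = a.bit_length()
```

Note that `a` holds only four elements yet occupies 65 bits, because the bit array must reach position 64. Sparse sets with large values waste space in this way, which is the storage cost that compression schemes are meant to reduce.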

Optimizing Frequency Queries for Data Mining Applications

  • H. Malik, J. Kender
  • Computer Science
    Seventh IEEE International Conference on Data Mining (ICDM 2007)
  • 2007
This paper compares the memory requirements and support counting performance of FP Tree and Compressed Patricia Trie against several novel variants of vertical bit vectors, and proposes HDO, a novel Hamming-distance-based greedy transaction reordering scheme, and aHDO, a linear-time approximation to HDO.

Position list word aligned hybrid: optimizing space and performance for compressed bitmaps

The Position List Word Aligned Hybrid (PLWAH) compression scheme is presented, which improves significantly over WAH compression by better utilizing the available bits and new CPU instructions.

Byte-aligned bitmap compression

  • G. Antoshenkov
  • Computer Science
    Proceedings DCC '95 Data Compression Conference
  • 1995
The proposed byte-aligned bitmap compression method (BBC) aims to support fast set operations on the compressed bitmap formats and, at the same time, to retain a competitive compression rate.

Optimizing bitmap indices with efficient compression

This article presents a new compression scheme called Word-Aligned Hybrid (WAH) code that makes compressed bitmap indices efficient even for high-cardinality attributes and proves that the new compressed bit map index, like the best variants of the B-tree index, is optimal for one-dimensional range queries.

Analyses of multi-level and multi-component compressed bitmap indexes

This analysis is the first to fully incorporate the effects of compression on the performance of well-known bitmap indexes, and investigates a number of novel variations in a class of multi-level indexes, finding that they answer queries faster than the best of multi-component indexes.

FastBit: interactively searching massive data

A summary is presented of the key underlying technologies, namely bitmap compression, encoding, and binning, that enable FastBit to answer structured (SQL) queries orders of magnitude faster than popular database systems.

Design of the Java HotSpot™ client compiler for Java 6

The new architecture of the client compiler is outlined and how it interacts with the VM is shown, including the intermediate representation that now uses static single-assignment (SSA) form and the linear scan algorithm for global register allocation.

Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator

A new algorithm called Mersenne Twister (MT) is proposed for generating uniform pseudorandom numbers, which provides a super astronomical period of 2^19937 − 1 and 623-dimensional equidistribution up to 32-bit accuracy, while using a working area of only 624 words.

DEFLATE Compressed Data Format Specification version 1.3

This specification defines a lossless compressed data format that compresses data using a combination of the LZ77 algorithm and Huffman coding, with efficiency comparable to the best currently

Mersenne Twister A Pseudo-Random Number Generator

This paper describes the theories and algorithm of the Mersenne Twister, and briefly introduces its variants and discusses its limitations and improvements.