A parallel Huffman coder on the CUDA architecture


We present a parallel implementation of the Huffman coder, a widely used entropy encoding algorithm, on the NVIDIA CUDA architecture. After constructing the Huffman codeword tree serially, we proceed in parallel by generating a byte stream in which each byte represents a single bit of the compressed output stream. The final step combines every 8 consecutive bytes into a single byte, also in parallel, to produce the final compressed output bit stream. Experimental results show speedups of up to 22× over the serial CPU implementation, with no constraint on the maximum codeword length or data entropy.
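The two parallel stages described in the abstract can be illustrated with a minimal serial C++ sketch. It is not the authors' CUDA code: the names `CodeTable`, `expand_to_bit_bytes`, and `pack_bits` are hypothetical, and the prefix sum used to find each symbol's write offset, the MSB-first bit order, and the zero padding of the last byte are all assumptions the abstract does not specify. Each loop iteration marked below is independent of the others, which is what would let it map onto one CUDA thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical codeword table: entry s holds the code for symbol s as a
// string of '0'/'1' characters, so codeword length is unbounded.
using CodeTable = std::vector<std::string>;

// Stage 1: expand the input into a stream with one byte (0 or 1) per output bit.
std::vector<std::uint8_t> expand_to_bit_bytes(const std::vector<std::uint8_t>& input,
                                              const CodeTable& codes) {
    assert(codes.size() == 256);

    // Assumption: an exclusive prefix sum over codeword lengths gives the
    // position where each symbol's code starts in the bit-byte stream.
    std::vector<std::size_t> offset(input.size() + 1, 0);
    for (std::size_t i = 0; i < input.size(); ++i) {
        offset[i + 1] = offset[i] + codes[input[i]].size();
    }

    std::vector<std::uint8_t> bits(offset.back());
    // Independent per symbol: one CUDA thread could handle each i.
    for (std::size_t i = 0; i < input.size(); ++i) {
        const std::string& code = codes[input[i]];
        for (std::size_t k = 0; k < code.size(); ++k) {
            bits[offset[i] + k] = (code[k] == '1') ? 1 : 0;
        }
    }
    return bits;
}

// Stage 2: combine every 8 consecutive bit-bytes into one output byte.
std::vector<std::uint8_t> pack_bits(const std::vector<std::uint8_t>& bits) {
    std::vector<std::uint8_t> out((bits.size() + 7) / 8, 0);
    // Independent per output byte: one CUDA thread could handle each j.
    for (std::size_t j = 0; j < out.size(); ++j) {
        std::uint8_t packed = 0;
        for (std::size_t k = 0; k < 8; ++k) {
            const std::size_t idx = j * 8 + k;
            // Assumption: pad the final byte with zero bits, MSB first.
            const std::uint8_t bit = (idx < bits.size()) ? bits[idx] : 0;
            packed = static_cast<std::uint8_t>((packed << 1) | bit);
        }
        out[j] = packed;
    }
    return out;
}
```

With the illustrative codes a = 0, b = 10, c = 11, the input "abcab" expands to the bit-bytes 0,1,0,1,1,0,1,0 and packs into the single byte 0x5A. Spending a full byte per bit in the intermediate stream costs memory, but it means no two writers ever touch the same byte, so the expansion stage needs no atomic operations.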

DOI: 10.1109/VCIP.2014.7051566

Cite this paper

@article{Rahmani2014APH,
  title={A parallel Huffman coder on the CUDA architecture},
  author={Habibelahi Rahmani and Cihan Topal and Cuneyt Akinlar},
  journal={2014 IEEE Visual Communications and Image Processing Conference},
  year={2014},
  pages={311-314}
}