Data Compression Considering Text Files

Abstract

Lossless text compression is an important field because it significantly reduces storage requirements and communication costs. This work focuses on file compression coding techniques and on comparisons between them. Several memory-efficient encoding schemes are analyzed and implemented: Shannon-Fano Coding, Huffman Coding, Repeated Huffman Coding, and Run-Length Coding. A new algorithm, “Modified Run-Length Coding”, is also proposed and compared with the other algorithms. The analyses show how each technique works, how much compression it can achieve, how much memory it needs, and which technique performs better under which conditions. The experiments show that Repeated Huffman Coding achieves a higher compression ratio, and that the proposed Modified Run-Length Coding outperforms the conventional Run-Length Coding.
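For readers unfamiliar with the conventional Run-Length Coding named above, the following is a minimal Python sketch of the textbook scheme. It is an illustration only, not the authors' implementation, and it does not attempt the proposed Modified Run-Length Coding, whose details the abstract does not give. The function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair.

    Textbook run-length coding; an illustrative assumption, not the
    paper's implementation.
    """
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text by repeating each character count times."""
    return "".join(ch * count for ch, count in pairs)


sample = "aaaabbbcca"
encoded = rle_encode(sample)
print(encoded)                       # list of (character, run length) pairs
print(rle_decode(encoded) == sample)  # lossless round trip
```

Because every run costs one pair regardless of its length, this scheme pays off only when the input contains long runs; ordinary prose, which has few repeated adjacent characters, can grow rather than shrink.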

Cite this paper

@inproceedings{Sailunaz2014DataCC, title={Data Compression Considering Text Files}, author={Kashfia Sailunaz and Mohammed Rokibul Alam and Mohammad Nurul Huda}, year={2014} }