# Code and parse trees for lossless source encoding

@article{Abrahams1997CodeAP,
title={Code and parse trees for lossless source encoding},
author={Julia Abrahams},
journal={Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171)},
year={1997},
pages={145-171}
}
• J. Abrahams
• Published 11 June 1997
• Computer Science
This paper surveys the theoretical literature on fixed-to-variable-length lossless source code trees, called code trees, and on variable-length-to-fixed lossless source code trees, called parse trees. In particular, the following code tree topics are outlined in this survey: characteristics of the Huffman (1952) code tree; Huffman-type coding for infinite source alphabets and universal coding; the Huffman problem subject to a lexicographic constraint, or the Hu-Tucker (1971) problem; the…
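As a concrete illustration of the fixed-to-variable code trees the survey covers, here is a minimal sketch (not from the paper; frequencies and symbol names are illustrative) of Huffman's 1952 construction in Python: repeatedly merge the two lowest-weight subtrees until one tree remains, then read codewords off root-to-leaf paths.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a binary Huffman code from a dict of symbol frequencies.

    Returns a dict mapping each symbol to its binary codeword string.
    """
    # Each heap entry: (weight, tie-breaker, tree). A tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # two lowest-weight subtrees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # single-symbol alphabet edge case
    _, _, root = heap[0]
    walk(root, "")
    return codes

freqs = Counter("abracadabra")            # a:5, b:2, r:2, c:1, d:1
codes = huffman_code(freqs)
```

On this toy input the most frequent symbol `a` gets a one-bit codeword, and the codeword lengths satisfy the Kraft equality, as any full binary code tree must.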
105 Citations
New bounds on D-ary optimal codes
• Computer Science
Inf. Process. Lett.
• 2005
Average Redundancy for Known Sources: Ubiquitous Trees in Source Coding
This survey concentrates on one facet of information theory (i.e., source coding better known as data compression), namely the redundancy rate problem, and investigates average redundancy of Huffman, Tunstall, and Khodak codes.
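For contrast with the Huffman code tree, a minimal sketch (not from the paper; the source probabilities and dictionary size are illustrative) of the Tunstall parse-tree construction mentioned above: in variable-length-to-fixed coding, the most probable leaf of the parse tree is repeatedly expanded until the dictionary is full, and each resulting source word is mapped to one fixed-length codeword.

```python
import heapq

def tunstall_dictionary(probs, num_words):
    """Build a Tunstall parse dictionary for a memoryless source.

    probs: dict mapping each source symbol to its probability.
    num_words: target dictionary size (>= len(probs)).
    Returns the sorted list of variable-length source words (the leaves
    of the parse tree); each is assigned one fixed-length codeword.
    """
    # Max-heap of leaves keyed by negated word probability.
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)
    # Each expansion replaces one leaf by len(probs) children,
    # adding len(probs) - 1 leaves.
    while len(heap) + len(probs) - 1 <= num_words:
        neg_p, word = heapq.heappop(heap)      # most probable leaf
        for s, p in probs.items():             # expand into its children
            heapq.heappush(heap, (neg_p * p, word + s))
    return sorted(word for _, word in heap)

words = tunstall_dictionary({"a": 0.7, "b": 0.3}, 4)
```

With four dictionary words, each parsed source word can be emitted as a fixed two-bit codeword; the resulting word set is complete and prefix-free, so every source sequence can be parsed.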
Tail Bounds for the Wiener Index of Random Trees
• Mathematics
• 2008
We study a random walk with positive drift in the first quadrant of the plane. For a given connected region C of the first quadrant, we analyze the number of paths contained in C and the first exit
On the Exit Time of a Random Walk with Positive Drift
• Mathematics
• 2007
We study a random walk with positive drift in the first quadrant of the plane. For a given connected region $\mathcal{C}$ of the first quadrant, we analyze the number of paths contained in
Optimal Parsing Trees for Run-Length Coding of Biased Data
• Computer Science
IEEE Transactions on Information Theory
• 2006
Here, the main result is that one can use the Tunstall source coding algorithm to generate optimal codes for a partial class of (d, k) constraints.
More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding
• Computer Science
IEEE Trans. Inf. Theory
• 2008
This paper provides a new algorithm that works with infinite encoding alphabets and often provides better error bounds than the best previous ones known, when restricted to the finite alphabet case.
Optimal alphabet for single text compression
• Computer Science, Mathematics
• 2022
The optimal noiseless compression of texts using the Huffman code is studied, where the alphabet of encoding coincides with one of those representations, and it is shown that it is necessary to account for the codebook when compressing a single text.
A random code generation method based on syntax tree layering model
• Computer Science
International Conference on Electronics and Information Engineering
• 2021
Experiments along the three dimensions of code complexity, control flow, and semantic similarity show that this method can randomly generate a large amount of source code, and that the generated code has low similarity in control-flow graph and semantics.
Computer Science – Theory and Applications: 15th International Computer Science Symposium in Russia, CSR 2020, Yekaterinburg, Russia, June 29 – July 3, 2020, Proceedings
• Computer Science
CSR
• 2020
This paper investigates the pre-image resistance of this function and shows that it reveals only O(1) bits of information about the input, and discusses cryptographic properties of quantum hashing.
Optimal Skeleton Huffman Trees Revisited
• Computer Science
CSR
• 2020
An $O(n^2 \log n)$ time algorithm that, given $n$ symbol frequencies, constructs an optimal skeleton tree and its corresponding optimal code.

## References

A Dynamic Programming Algorithm for Constructing Optimal Prefix-Free Codes with Unequal Letter Costs
• Computer Science
IEEE Trans. Inf. Theory
• 1998
This work considers the problem of constructing prefix-free codes of minimum cost when the encoding alphabet contains letters of unequal length and introduces a new dynamic programming solution that optimally encodes n words in $O(n^{C+2})$ time.
Conditions for Optimality of the Huffman Algorithm
A new general formulation of Huffman tree construction with broad application is presented, and a wide class of weight combination functions, the quasilinear functions, for which the Huffman algorithm produces optimal trees under correspondingly wide classes of cost criteria, is characterized.
Nonexhaustive Generalized Fibonacci Trees in Unequal Costs Coding Problems
Each tree in the generalized Fibonacci sequence solves a minimax coding problem related to Varn coding, where each symbol from a uniformly distributed source is to be encoded by a string of code symbols associated with the path through the tree from the root to the leaf associated with the source symbol.
A new bound for the data expansion of Huffman codes
• Computer Science
IEEE Trans. Inf. Theory
• 1997
It is proved that the maximum data expansion of Huffman codes is upper-bounded by $\delta < 1.39$, which improves on the previous best known upper bound $\delta < 2.39$.
Existence of optimal prefix codes for infinite source alphabets
• Computer Science
IEEE Trans. Inf. Theory
• 1997
It is proven that for every random variable with a countably infinite set of outcomes and finite entropy there exists an optimal prefix code which can be constructed from Huffman codes for truncated
Optimal algorithms for inserting a random element into a random heap
Two algorithms for inserting a random element into a random heap are shown to be optimal (in the sense that they use the least number of comparisons on the average among all comparison-based
Optimal prefix codes for two-sided geometric distributions
• Computer Science
Proceedings of IEEE International Symposium on Information Theory
• 1997
A complete characterization of optimal prefix codes is presented for off-centered, two-sided geometric distributions of the integers, often encountered in lossless image compression applications, as probabilistic models for image prediction residuals.
The Kolmogorov complexity, universal distribution, and coding theorem for generalized length functions
Usefulness is defined for the expressions that define the Kolmogorov complexity and the universal distribution, and the relation among the usefulness for the coding theorem, that for the Kolmogorov complexity, and that for the universal distribution is considered.
The synchronization of variable-length codes
A novel method for estimating the synchronization performance for a wide variety of variable-length codes, based on the T-Codes, which typically synchronize within 2-3 codewords by a mechanism that derives from a recursive T-augmentation construction.
Universal coding of integers and unbounded search trees
• Computer Science
IEEE Trans. Inf. Theory
• 1997
The modified log-star function is introduced to reveal the existence of better prefix codes than the Elias omega code and other known codes, including the Bentley-Yao search tree.
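The Elias omega code mentioned here is easy to sketch; this illustrative Python encoder (not from the paper) recursively prefixes each value with its binary expansion and encodes each group's length minus one in the group before it, terminating with a single zero bit.

```python
def elias_omega(n):
    """Encode a positive integer n with the Elias omega universal code."""
    assert n >= 1
    code = "0"              # terminating zero bit
    while n > 1:
        b = bin(n)[2:]      # binary representation, MSB first, leading 1
        code = b + code     # prepend this group
        n = len(b) - 1      # next, encode this group's length minus one
    return code
```

Each group starts with a 1 bit and the final bit is 0, which is what lets a decoder know when to stop reading length groups.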