Data Compression Using Adaptive Coding and Partial String Matching
This paper describes how the conflict can be resolved with partial string matching, and reports experimental results which show that mixed-case English text can be coded in as little as 2.2 bits/character with no prior knowledge of the source.
Arithmetic coding for data compression
The state of the art in data compression is arithmetic coding, not the better-known Huffman method. Arithmetic coding gives greater compression, is faster for adaptive models, and clearly separates…
Text Compression
K*: An Instance-based Learner Using an Entropic Distance Measure
K*, an instance-based learner that uses entropy as a distance measure, is described, and results that compare favourably with several machine learning algorithms are presented.
Unbounded Length Contexts for PPM
A variant of the PPM algorithm, called PPM*, is described that exploits contexts of unbounded length. Although it requires considerably greater computational resources, it reliably achieves compression superior to the benchmark PPMC version.
"Self-organized Language Modeling for Speech Recognition". In
"A new data structure for cumulative probability tables". Soft-
"The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression".
Modeling for text compression
This paper surveys successful strategies for adaptive modeling that are suitable for use in practical text compression systems. The strategies fall into three main classes, including finite-context modeling, in which the last few characters are used to condition the probability distribution for the next one.
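The finite-context idea in the abstract above can be illustrated with a minimal sketch. It is not the method of any paper listed here: the function name `bits_per_char`, the add-one smoothing, and the fixed 256-symbol alphabet are all assumptions chosen for brevity, standing in for the escape mechanisms that PPM-style coders use. The sketch adaptively estimates the probability of each character from the `order` characters before it and reports the ideal code length in bits per character.

```python
import math
from collections import Counter, defaultdict


def bits_per_char(text: str, order: int = 2, alphabet_size: int = 256) -> float:
    """Average ideal code length (bits/char) under an adaptive order-`order` model.

    Assumption: add-one (Laplace) smoothing over a fixed alphabet replaces the
    escape/blending schemes used by real PPM implementations.
    """
    if not text:
        return 0.0
    counts = defaultdict(Counter)  # context string -> counts of the next character
    total_bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]  # the last `order` characters (fewer at the start)
        seen = counts[ctx]
        p = (seen[ch] + 1) / (sum(seen.values()) + alphabet_size)
        total_bits += -math.log2(p)  # ideal cost of coding `ch` with probability p
        seen[ch] += 1  # update only after coding, so a decoder could stay in sync
    return total_bits / len(text)
```

On strongly repetitive input, a longer context should give a lower figure than order 0, because each context predicts its successor more sharply. An arithmetic coder driven by the same probabilities would approach these code lengths in practice.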
Unbounded length contexts for PPM
  • J. Cleary, W. Teahan
  • Computer Science
  • Proceedings DCC '95 Data Compression Conference
  • 28 March 1995
A new algorithm is described, PPM*, which exploits contexts of unbounded length and reliably achieves compression superior to PPMC, although the current implementation uses considerably greater computational resources (both time and space).
The entropy of English using PPM-based models
  • W. Teahan, J. Cleary
  • Computer Science
  • Proceedings of Data Compression Conference - DCC…
  • 31 March 1996
The importance of training text for PPM is demonstrated, and it is shown that performance can be improved by "adjusting" the alphabet used. Results based on these improvements are given.
Analysis of an algorithm for fast ray tracing using uniform space subdivision
A theoretical analysis is given of the algorithm used to speed up ray tracing by means of space subdivision, showing how the space and time requirements vary with the number of objects in the scene.