Back-off language model compression

Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat
With the availability of large amounts of training data relevant to speech recognition scenarios, scalability becomes a very productive way to improve language model performance. We present a technique that represents a back-off n-gram language model using arrays of integer values and thus renders it amenable to effective block compression. We propose a few such compression algorithms and evaluate the resulting language model along two dimensions: memory footprint, and speed reduction relative…
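The abstract's core idea — storing the model as arrays of integers so that generic block compression applies — can be illustrated with a minimal sketch. This is not the paper's actual algorithm; it assumes a common scheme (per-block delta coding of sorted integer IDs followed by base-128 varint byte encoding), and all function names are illustrative:

```python
# Hypothetical sketch: compress a sorted array of integer IDs (e.g. n-gram
# word IDs) in fixed-size blocks, so a block can be decoded on demand
# without decompressing the whole array. Assumed scheme: delta coding
# within each block + little-endian base-128 varint byte encoding.

def varint_encode(n: int) -> bytes:
    """Encode a non-negative integer as a little-endian base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def compress_block(values: list[int]) -> bytes:
    """Delta-code a sorted block (first value absolute), varint the gaps."""
    out = bytearray()
    prev = 0
    for v in values:
        out += varint_encode(v - prev)
        prev = v
    return bytes(out)

def decompress_block(data: bytes) -> list[int]:
    """Invert compress_block: accumulate varint gaps back into values."""
    values, prev, n, shift = [], 0, 0, 0
    for b in data:
        n |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7       # continuation byte
        else:
            prev += n        # gap complete: restore absolute value
            values.append(prev)
            n = shift = 0
    return values

def compress(values: list[int], block_size: int = 64) -> list[bytes]:
    """Split a sorted array into independent blocks and compress each,
    trading a little size for random access at block granularity."""
    return [compress_block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]
```

Because each block is decoded independently, a lookup only pays the cost of decompressing one block — the kind of memory-footprint/speed trade-off the abstract evaluates.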
This paper has 17 citations.
