Corpus ID: 3104382

Language model size reduction by pruning and clustering

@inproceedings{Goodman2000LanguageMS,
  title={Language model size reduction by pruning and clustering},
  author={Joshua Goodman and Jianfeng Gao},
  booktitle={INTERSPEECH},
  year={2000}
}
  • Joshua Goodman, Jianfeng Gao
  • Published in INTERSPEECH 2000
  • Computer Science
  • Several techniques are known for reducing the size of language models, including count cutoffs [1], Weighted Difference pruning [2], Stolcke pruning [3], and clustering [4]. We compare all of these techniques and show some surprising results. For instance, at low pruning thresholds, Weighted Difference and Stolcke pruning underperform count cutoffs. We then show novel clustering techniques that can be combined with Stolcke pruning to produce the smallest models at a given perplexity. The…

    Citations

    Publications citing this paper.
    SHOWING 1-10 OF 39 CITATIONS

    Language Segmentation

    Reducing infrequent-token perplexity via variational corpora

    CITES BACKGROUND

    Survey of data-selection methods in statistical machine translation

    CITES METHODS & BACKGROUND

    Perplexity on Reduced Corpora

    CITES BACKGROUND & METHODS

    Runtime Application Behavior Prediction Using a Statistical Metric Model

    CITES METHODS

    Efficient representation and fast look-up of Maximum Entropy language models

    Randomized maximum entropy language models

    CITES BACKGROUND

    References

    Publications referenced by this paper.
    SHOWING 1-5 OF 5 REFERENCES

    Scalable backoff language models

    HIGHLY INFLUENTIAL

    Multi-class composite N-gram based on connection direction

    • Hirofumi Yamamoto, Yoshinori Sagisaka
    • Computer Science
    • 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258)
    • 1999