Corpus ID: 14201771

N-gram distribution based language model adaptation

@inproceedings{Gao2000NgramDB,
  title={N-gram distribution based language model adaptation},
  author={Jianfeng Gao and Mingjing Li and Kai-Fu Lee},
  booktitle={INTERSPEECH},
  year={2000}
}
This paper presents two techniques for language model (LM) adaptation. The first aims to build a more general LM. We propose a distribution-based pruning of n-gram LMs, where we prune n-grams that are likely to be infrequent in a new document. Experimental results show that the distribution-based pruning method performed up to 9% better (in word perplexity reduction) than conventional cutoff methods. Moreover, the pruning method results in a more general n-gram backoff model, in spite of the domain…
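The contrast between conventional count-cutoff pruning and distribution-based pruning can be sketched in a few lines. The toy corpus, the `cutoff_prune`/`distribution_prune` helpers, and the use of document frequency as the pruning signal are illustrative assumptions; the paper's actual criterion predicts an n-gram's expected frequency in a new document from its distribution across training documents.

```python
from collections import Counter

# Hypothetical toy corpus: a list of "documents", each a token list.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat and a dog".split(),
]

def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

def cutoff_prune(docs, cutoff=1):
    # Conventional cutoff: drop bigrams whose total corpus count
    # is at or below a fixed threshold.
    total = Counter(bg for d in docs for bg in bigrams(d))
    return {bg for bg, c in total.items() if c > cutoff}

def distribution_prune(docs, min_doc_frac=0.5):
    # Distribution-based sketch: keep bigrams that occur in enough
    # distinct documents that they are likely to recur in a new one.
    # Document frequency stands in here for the paper's predicted
    # frequency in a new document.
    n_docs = len(docs)
    df = Counter()
    for d in docs:
        for bg in set(bigrams(d)):
            df[bg] += 1
    return {bg for bg, k in df.items() if k / n_docs >= min_doc_frac}

kept_cutoff = cutoff_prune(docs)
kept_dist = distribution_prune(docs)
```

On this toy corpus both methods keep the bigrams shared across documents, such as ("sat", "on"), and drop one-off bigrams like ("the", "mat"); the methods diverge on n-grams that are frequent within a single document but absent elsewhere, which cutoff pruning keeps and distribution-based pruning discards.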
