An Empirical Study of Smoothing Techniques for Language Modeling

@inproceedings{Chen1996AnES,
  title={An Empirical Study of Smoothing Techniques for Language Modeling},
  author={S. F. Chen and Joshua Goodman},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={1996}
}
We present a tutorial introduction to n-gram models for language modeling and survey the most widely-used smoothing algorithms for such models. We then present an extensive empirical comparison of several of these smoothing techniques, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate how factors such as training data size, training corpus (e.g., Brown versus Wall Street Journal), count cutoffs, and n-gram order (bigram versus trigram) affect the relative performance of these methods, which is measured through the cross-entropy of test data.
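To make the evaluation setup concrete, the sketch below shows a bigram model with Jelinek-Mercer style interpolation between bigram and unigram estimates, scored by cross-entropy on test text. This is a minimal illustration, not the paper's implementation: the fixed interpolation weight LAMBDA, the toy corpus, and the function names are assumptions for the example (the paper tunes such weights on held-out data and compares several smoothing methods).

```python
# Minimal sketch: interpolated (Jelinek-Mercer style) bigram model and
# cross-entropy evaluation. LAMBDA and the toy data are illustrative only.
import math
from collections import Counter

LAMBDA = 0.7  # assumed fixed interpolation weight for this sketch


def train(tokens):
    """Collect unigram and bigram counts from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    return unigrams, bigrams


def prob(w_prev, w, unigrams, bigrams, total):
    """Interpolate the bigram MLE with the unigram MLE."""
    p_uni = unigrams[w] / total
    p_bi = bigrams[(w_prev, w)] / unigrams[w_prev] if unigrams[w_prev] else 0.0
    return LAMBDA * p_bi + (1 - LAMBDA) * p_uni


def cross_entropy(test_tokens, unigrams, bigrams, total):
    """Average negative log2 probability per test bigram (lower is better)."""
    log_sum, n = 0.0, 0
    for w_prev, w in zip(test_tokens[:-1], test_tokens[1:]):
        p = prob(w_prev, w, unigrams, bigrams, total)
        if p > 0:  # sketch only: a full model would also smooth unseen unigrams
            log_sum -= math.log2(p)
            n += 1
    return log_sum / max(n, 1)


if __name__ == "__main__":
    train_tokens = "the cat sat on the mat the dog sat on the rug".split()
    test_tokens = "the cat sat on the rug".split()
    unigrams, bigrams = train(train_tokens)
    total = sum(unigrams.values())
    print("cross-entropy:", cross_entropy(test_tokens, unigrams, bigrams, total))
```

Interpolating toward the unigram distribution is what keeps unseen bigrams from receiving zero probability, which is the basic problem all of the surveyed smoothing methods address in different ways.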
Highly cited: this paper has 56 citations.