Topic-based language models using EM

Daniel Gildea and Thomas Hofmann
In this paper, we propose a novel statistical language model to capture topic-related long-range dependencies. Topics are modeled in a latent variable framework in which we also derive an EM algorithm to perform a topic factor decomposition based on a segmented training corpus. The topic model is combined with a standard language model to be used for on-line word prediction. Perplexity results indicate an improvement over previously proposed topic models, which unfortunately has not translated…
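The abstract does not spell out the model, so the following is only an illustrative sketch of what an EM-based topic factor decomposition over a segmented corpus could look like. It assumes an aspect-style mixture, P(w | d) = sum over t of P(w | t) P(t | d), where d indexes training segments and t indexes latent topics. That factorization, the function names `fit_topics` and `log_likelihood`, and the toy counts are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def normalize(values):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def fit_topics(counts, num_topics, iterations=50, seed=0):
    """Fit an assumed mixture P(w|d) = sum_t P(w|t) P(t|d) with EM.

    counts[d][w] is the count of word w in training segment d.
    Returns (p_w_t, p_t_d) as nested lists of probabilities.
    """
    rng = random.Random(seed)
    num_docs, vocab = len(counts), len(counts[0])
    # Random positive initialization; the +0.1 keeps values away from zero.
    p_w_t = [normalize([rng.random() + 0.1 for _ in range(vocab)])
             for _ in range(num_topics)]
    p_t_d = [normalize([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_docs)]
    for _ in range(iterations):
        # Accumulators start at a tiny floor so no probability becomes 0.
        new_w_t = [[1e-12] * vocab for _ in range(num_topics)]
        new_t_d = [[1e-12] * num_topics for _ in range(num_docs)]
        for d in range(num_docs):
            for w in range(vocab):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior over topics for this (segment, word) pair.
                post = normalize([p_w_t[t][w] * p_t_d[d][t]
                                  for t in range(num_topics)])
                # Collect expected counts for the M-step.
                for t in range(num_topics):
                    new_w_t[t][w] += n * post[t]
                    new_t_d[d][t] += n * post[t]
        # M-step: renormalize expected counts into probabilities.
        p_w_t = [normalize(row) for row in new_w_t]
        p_t_d = [normalize(row) for row in new_t_d]
    return p_w_t, p_t_d


def log_likelihood(counts, p_w_t, p_t_d):
    """Log-likelihood of the training counts under the fitted mixture."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_w_t[t][w] * p_t_d[d][t] for t in range(len(p_w_t)))
                total += n * math.log(p)
    return total


if __name__ == "__main__":
    # Toy corpus: 4 segments over a 4-word vocabulary (made-up numbers).
    toy = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 1, 4, 6]]
    p_w_t, p_t_d = fit_topics(toy, num_topics=2)
    tokens = sum(map(sum, toy))
    print("training perplexity:",
          math.exp(-log_likelihood(toy, p_w_t, p_t_d) / tokens))
```

The perplexity printed here is on the toy training counts only. The on-line word prediction setting and the combination with a standard language model described in the abstract are not covered by this sketch.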