Summarization by Latent Dirichlet Allocation: Superior Sentence Extraction through Topic Modeling

Kenton W. Murray
Latent Dirichlet allocation, or LDA, is a successful generative probabilistic model of text corpora that has performed well on many tasks in many areas of Natural Language Processing. Despite being perfectly suited for Automatic Summarization tasks, it has never been applied to them. In this paper, I introduce Summarization by LDA, or SLDA, which better models the subtopics of a document, leading to more pertinent, relevant, and concise summaries than other summarization methods. This new…
