Corpus ID: 46932131

Topic Modelling of Empirical Text Corpora: Validity, Reliability, and Reproducibility in Comparison to Semantic Maps

@article{Hecking2018TopicMO,
  title={Topic Modelling of Empirical Text Corpora: Validity, Reliability, and Reproducibility in Comparison to Semantic Maps},
  author={Tobias Hecking and Loet Leydesdorff},
  journal={ArXiv},
  year={2018},
  volume={abs/1806.01045}
}
  • Tobias Hecking, Loet Leydesdorff
  • Published 2018
  • Computer Science
  • ArXiv
  • Using the 6,638 case descriptions of societal impact submitted for evaluation in the Research Excellence Framework (REF 2014), we replicate the topic model (Latent Dirichlet Allocation or LDA) made in this context and compare the results with factor-analytic results using a traditional word-document matrix (Principal Component Analysis or PCA). Removing a small fraction of documents from the sample, for example, has on average a much larger impact on LDA than on PCA-based models to the extent…
