HTM: A Topic Model for Hypertexts

Abstract

Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recently, topic models for processing hypertexts such as web pages have also been proposed. The proposed hypertext models are generative models that give rise to both words and hyperlinks. This…

Cite this paper

@inproceedings{Sun2008HTMAT, title={HTM: A Topic Model for Hypertexts}, author={Congkai Sun and Bin Gao and Zhenfu Cao and Hang Li}, booktitle={EMNLP}, year={2008} }