PPLSA: Parallel Probabilistic Latent Semantic Analysis Based on MapReduce

@inproceedings{Li2012PPLSAPP,
  title={PPLSA: Parallel Probabilistic Latent Semantic Analysis Based on MapReduce},
  author={Ning Li and Fuzhen Zhuang and Qing He and Zhongzhi Shi},
  booktitle={Intelligent Information Processing},
  year={2012}
}
PLSA (Probabilistic Latent Semantic Analysis) is a popular topic modeling technique for exploring document collections. As large datasets become increasingly common, the computation in PLSA needs to scale better. In this paper, we propose a parallel PLSA algorithm, called PPLSA, that handles large corpora within the MapReduce framework. Our solution distributes the computation efficiently and is relatively simple to implement.
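
The abstract does not say how PPLSA splits the work between mappers and reducers, so the following is only a minimal sketch of one plausible arrangement, not the authors' algorithm. It assumes the standard asymmetric PLSA model, P(w|d) = Σ_z P(w|z) P(z|d), fitted by EM, with each document handled by one mapper call and a reducer that sums expected counts per key. All names here (`init_params`, `map_document`, `reduce_sum`, `em_iteration`) are hypothetical, and the whole thing runs in a single process to imitate the map and reduce phases.

```python
from collections import defaultdict
import random


def init_params(corpus, num_topics, seed=0):
    """Random, strictly positive, normalized starting values (an assumption)."""
    rng = random.Random(seed)
    vocab = sorted({w for counts in corpus.values() for w in counts})
    p_w_z = []
    for _ in range(num_topics):
        raw = {w: rng.random() + 0.1 for w in vocab}
        total = sum(raw.values())
        p_w_z.append({w: v / total for w, v in raw.items()})
    p_z_d = {}
    for doc_id in corpus:
        raw = [rng.random() + 0.1 for _ in range(num_topics)]
        total = sum(raw)
        p_z_d[doc_id] = [v / total for v in raw]
    return p_w_z, p_z_d


def map_document(doc_id, counts, p_w_z, p_z_d, num_topics):
    """Mapper: E-step for one document, emitting partial expected counts."""
    for word, n in counts.items():
        # The posterior P(z | d, w) is proportional to P(w | z) * P(z | d).
        weights = [p_w_z[z][word] * p_z_d[doc_id][z] for z in range(num_topics)]
        total = sum(weights)
        if total == 0.0:
            continue
        for z, weight in enumerate(weights):
            expected = n * weight / total
            yield ("w|z", z, word), expected    # contributes to P(w | z)
            yield ("z|d", doc_id, z), expected  # contributes to P(z | d)


def reduce_sum(pairs):
    """Reducer: sum all values that share a key."""
    sums = defaultdict(float)
    for key, value in pairs:
        sums[key] += value
    return sums


def em_iteration(corpus, p_w_z, p_z_d, num_topics):
    """One EM iteration: map over documents, reduce, then normalize (M-step)."""
    emitted = (
        pair
        for doc_id, counts in corpus.items()
        for pair in map_document(doc_id, counts, p_w_z, p_z_d, num_topics)
    )
    sums = reduce_sum(emitted)

    new_p_w_z = [defaultdict(float) for _ in range(num_topics)]
    new_p_z_d = {doc_id: [0.0] * num_topics for doc_id in corpus}
    for key, value in sums.items():
        if key[0] == "w|z":
            _, z, word = key
            new_p_w_z[z][word] = value
        else:
            _, doc_id, z = key
            new_p_z_d[doc_id][z] = value

    # Normalize the summed expected counts into probability distributions.
    for topic in new_p_w_z:
        total = sum(topic.values())
        if total > 0:
            for word in topic:
                topic[word] /= total
    for doc_id, row in new_p_z_d.items():
        total = sum(row)
        if total > 0:
            new_p_z_d[doc_id] = [v / total for v in row]
    return new_p_w_z, new_p_z_d
```

A corpus here is a dict that maps a document id to its word counts, for example `{"d1": {"apple": 2, "banana": 1}}`. Calling `init_params` once and then `em_iteration` repeatedly runs EM. In an actual MapReduce job the parameters would have to be shared with every mapper between iterations, which this sketch sidesteps by keeping everything in memory.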

Citations

Publications citing this paper.

Big Data Processing with Probabilistic Latent Semantic Analysis on MapReduce

2014 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery • 2014
