Sparse Overcomplete Latent Variable Decomposition of Counts Data

Abstract

An important problem in many fields is the analysis of counts data to extract meaningful latent components. Methods like Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) have been proposed for this purpose. However, they are limited in the number of components they can extract and lack an explicit provision to control the “expressiveness” of the extracted components. In this paper, we present a learning formulation to address these limitations by employing the notion of sparsity. We start with the PLSA framework and use an entropic prior in a maximum a posteriori formulation to enforce sparsity. We show that this allows the extraction of overcomplete sets of latent components which better characterize the data. We present experimental evidence of the utility of such representations.
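
As a rough illustration of the PLSA starting point mentioned in the abstract, the sketch below runs plain maximum-likelihood EM on a small counts matrix, using more components than features. It is not the paper's method: the entropic prior and the MAP update it leads to are omitted. The function names (`normalize`, `plsa_em`, `entropy`), the toy matrix, and the iteration count are all assumptions made for illustration. The `entropy` helper only reports the entropy of each mixture-weight vector, which is the kind of quantity an entropic prior penalizes.

```python
import math
import random


def normalize(xs):
    """Scale non-negative numbers to sum to 1 (uniform if they are all zero)."""
    total = sum(xs)
    if total == 0.0:
        return [1.0 / len(xs)] * len(xs)
    return [x / total for x in xs]


def plsa_em(V, n_components, n_iter=100, seed=0):
    """Plain maximum-likelihood PLSA (no sparsity prior) on a counts matrix.

    V is a list of F rows, each with N counts (one column per data vector).
    Returns W, where W[z][f] estimates P(f|z), and H, where H[n][z]
    estimates the mixture weight of component z in column n.
    """
    rng = random.Random(seed)
    F, N = len(V), len(V[0])
    W = [normalize([rng.random() + 0.1 for _ in range(F)])
         for _ in range(n_components)]
    H = [normalize([rng.random() + 0.1 for _ in range(n_components)])
         for _ in range(N)]
    for _ in range(n_iter):
        W_acc = [[0.0] * F for _ in range(n_components)]
        H_acc = [[0.0] * n_components for _ in range(N)]
        for n in range(N):
            for f in range(F):
                if V[f][n] == 0:
                    continue
                # E-step: posterior over components for this (f, n) cell.
                joint = [H[n][z] * W[z][f] for z in range(n_components)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_components):
                    share = V[f][n] * joint[z] / total
                    W_acc[z][f] += share
                    H_acc[n][z] += share
        # M-step: renormalize the expected counts into distributions.
        W = [normalize(row) for row in W_acc]
        H = [normalize(row) for row in H_acc]
    return W, H


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Toy counts: 3 features, 3 data vectors, 5 components (overcomplete).
V = [[4, 0, 1], [0, 5, 1], [2, 2, 6]]
W, H = plsa_em(V, n_components=5)
print([round(entropy(h), 3) for h in H])
```

With more components than features, plain EM still runs, but nothing in it pushes the mixture weights toward low entropy. That gap is what the abstract's sparsity prior is meant to address.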

Cite this paper

@inproceedings{Shashanka2007SparseOL,
  title={Sparse Overcomplete Latent Variable Decomposition of Counts Data},
  author={Madhusudana V. S. Shashanka and Bhiksha Raj and Paris Smaragdis},
  booktitle={NIPS},
  year={2007}
}