Online `1-Dictionary Learning with Application to Novel Document Detection

Abstract

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online `1-dictionary learning where unlike traditional dictionary learning, which uses squared loss, the `1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data, shows that this online `1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results. Our algorithm for online `1dictionary learning could be of independent interest.

Extracted Key Phrases

4 Figures and Tables

Cite this paper

@inproceedings{Kasiviswanathan2012OnlineL, title={Online `1-Dictionary Learning with Application to Novel Document Detection}, author={Shiva Prasad Kasiviswanathan and Huahua Wang and Arindam Banerjee}, year={2012} }