Hashing with Generalized Nyström Approximation


Hashing, which learns binary codes that embed high-dimensional data into a similarity-preserving low-dimensional Hamming space, is often formulated as linear dimensionality reduction followed by binary quantization. Linear dimensionality reduction based on the maximum-variance formulation requires the leading eigenvectors of the data covariance matrix or the graph Laplacian. Computing leading singular vectors or eigenvectors when both the dimension and the sample size are large is a main bottleneck in most data-driven hashing methods. In this paper we address the use of the generalized Nyström method, in which a subset of rows and columns is used to approximately compute the leading singular vectors of the data matrix, in order to improve the scalability of hashing methods for high-dimensional data with large sample size. In particular, we validate the useful behavior of generalized Nyström approximation with uniform sampling on a recently developed hashing method by Gong and Lazebnik, based on principal component analysis (PCA) followed by iterative quantization and referred to as PCA+ITQ. We compare generalized Nyström approximation with uniform and non-uniform sampling against the full singular value decomposition (SVD), confirming that uniform sampling dramatically improves the computational and space complexities while sacrificing little performance. In addition, we present low-rank approximation error bounds for generalized Nyström approximation with uniform sampling, which are not a trivial extension of the available results for the non-uniform sampling case.

Keywords: CUR decomposition; hashing; generalized Nyström approximation; pseudoskeleton approximation; uniform sampling
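
The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the truncation tolerance, the QR-based extraction of singular vectors, and the toy low-rank data are all assumptions made for illustration. The sketch samples rows and columns uniformly, forms a pseudoskeleton approximation A ≈ C W⁺ R, recovers approximate leading right singular vectors from the small factors, and then binarizes the projected data after an ITQ-style rotation.

```python
import numpy as np

def truncated_pinv(W, tol=1e-10):
    """Pseudo-inverse of W, discarding singular values below tol * largest."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    keep = s > tol * s[0]
    return (Vt[keep].T / s[keep]) @ U[:, keep].T

def generalized_nystrom(A, n_rows, n_cols, rng):
    """Uniformly sample rows/columns and return factors with A ~ C @ M @ R."""
    n, d = A.shape
    rows = rng.choice(n, size=n_rows, replace=False)
    cols = rng.choice(d, size=n_cols, replace=False)
    C = A[:, cols]               # sampled columns, n x n_cols
    R = A[rows, :]               # sampled rows, n_rows x d
    W = A[np.ix_(rows, cols)]    # intersection block, n_rows x n_cols
    return C, truncated_pinv(W), R

def leading_right_singular_vectors(C, M, R, k):
    """Top-k right singular vectors of C @ M @ R without forming the product."""
    Qc, Rc = np.linalg.qr(C)     # C = Qc @ Rc
    Qr, Rr = np.linalg.qr(R.T)   # R = Rr.T @ Qr.T
    _, _, Vt = np.linalg.svd(Rc @ M @ Rr.T, full_matrices=False)
    return Qr @ Vt[:k].T         # d x k, orthonormal columns

def itq_rotation(Z, n_iter, rng):
    """ITQ-style alternation: fix the rotation and update the codes, then
    fix the codes and update the rotation (orthogonal Procrustes)."""
    k = Z.shape[1]
    Rot, _ = np.linalg.qr(rng.standard_normal((k, k)))  # random orthogonal start
    for _ in range(n_iter):
        B = np.where(Z @ Rot >= 0, 1.0, -1.0)
        U, _, Wt = np.linalg.svd(B.T @ Z)
        Rot = (U @ Wt).T
    return Rot

# Toy data (assumption): 200 points in 50 dimensions with exact rank 5.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
X = X - X.mean(axis=0)           # PCA assumes centered data

C, M, R = generalized_nystrom(X, n_rows=20, n_cols=20, rng=rng)
V = leading_right_singular_vectors(C, M, R, k=5)
Rot = itq_rotation(X @ V, n_iter=20, rng=rng)
codes = (X @ V @ Rot >= 0).astype(np.uint8)   # 5-bit binary code per point
```

On exactly low-rank toy data the sampled factors reproduce the matrix up to rounding error. On real data the approximation quality depends on how many rows and columns are sampled, which is the trade-off the abstract's error bounds address.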

DOI: 10.1109/ICDM.2012.22

Cite this paper

@inproceedings{Yun2012HashingWG, title={Hashing with Generalized Nystr{\"{o}}m Approximation}, author={Jeong-Min Yun and Saehoon Kim and Seungjin Choi}, booktitle={ICDM}, year={2012} }