# Learning Binary Codes with Bagging PCA

```bibtex
@inproceedings{Leng2014LearningBC,
  title={Learning Binary Codes with Bagging PCA},
  author={Cong Leng and Jian Cheng and Ting Yuan and Xiao Bai and Hanqing Lu},
  booktitle={ECML/PKDD},
  year={2014}
}
```

For eigendecomposition-based hashing approaches, the information captured by different dimensions is unbalanced, and most of it is typically concentrated in the top eigenvectors. This often leads to the counterintuitive phenomenon that a longer code does not necessarily yield better performance. This paper leverages the bootstrap sampling idea and integrates it with PCA, resulting in a new projection method called Bagging PCA, in order to learn effective binary codes. Specifically, a small…
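The abstract's idea — running PCA on bootstrap samples and keeping only the top eigenvectors of each — can be sketched roughly as follows. This is an illustrative NumPy reconstruction from the abstract alone; the function and parameter names (`bagging_pca_codes`, `n_bags`, `bits_per_bag`) are not from the paper.

```python
import numpy as np

def bagging_pca_codes(X, n_bags=8, bits_per_bag=4, seed=0):
    """Binary codes from top-PCA bits of several bootstrap samples (a sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xc = X - X.mean(axis=0)                     # zero-center once for projection
    bits = []
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)        # bootstrap sample (with replacement)
        S = X[idx] - X[idx].mean(axis=0)
        cov = S.T @ S / n                       # sample covariance
        eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalue order
        W = eigvecs[:, ::-1][:, :bits_per_bag]  # keep only the top eigenvectors
        bits.append(Xc @ W > 0)                 # sign-threshold the projections
    return np.hstack(bits).astype(np.uint8)     # shape (n, n_bags * bits_per_bag)
```

Keeping only a few top eigenvectors per bootstrap sample avoids the low-variance tail directions, which is exactly the imbalance the abstract describes.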


#### 18 Citations

R2PCAH: Hashing with two-fold randomness on principal projections

- Mathematics, Computer Science
- Neurocomputing
- 2017

An R2PCAH framework is presented that conducts two-fold random transformations based on principal projections for hash-code learning, sharing the advantages of both LSH- and PCA-based hashing methods.

Learning binary code via PCA of angle projection for image retrieval

- Computer Science, Engineering
- Other Conferences
- 2018

This paper combines the ITQ hashing algorithm with a cosine-similarity projection for each dimension; the angle projection preserves the original structure and yields more compact codes, and the effectiveness of the proposed method is validated.

An ensemble diversity approach to supervised binary hashing

- Computer Science, Mathematics
- NIPS
- 2016

This work proposes a much simpler approach to binary hashing that is faster and trivially parallelizable, yet improves over the more complex, coupled objective function, achieving state-of-the-art precision and recall in image-retrieval experiments.

Locality Preserving Hashing based on Random Rotation and Offsets of PCA in Image Retrieval

- Computer Science
- 2018

Manifold-based subspace feature extraction methods have recently been deeply studied in data dimensionality reduction. Inspired by PCA Hashing (PCAH), if the Locality Preserving Projection (LPP) is…

Learning Independent, Diverse Binary Hash Functions: Pruning and Locality

- Mathematics, Computer Science
- 2016 IEEE 16th International Conference on Data Mining (ICDM)
- 2016

This work shows how to train improved algorithms on datasets orders of magnitude larger than those used by most work on supervised binary hashing, by pruning an ensemble of hash functions and by learning local hash functions.

Unsupervised Ensemble Hashing: Boosting Minimum Hamming Distance

- Computer Science
- IEEE Access
- 2020

An unsupervised ensemble hashing method is proposed in this paper to improve ranking accuracy by independently assembling diverse hash tables; it is observed that the greater the diversity among base learners, the higher the accuracy and the more effective the ensemble method.

Model Optimization Boosting Framework for Linear Model Hash Learning

- Medicine, Computer Science
- IEEE Transactions on Image Processing
- 2020

This study proposes a self-improvement framework called Model Boost (MoBoost) that improves model-parameter optimization for linear-model hashing methods, achieving better accuracy without adding new constraints or penalty terms.

An Ensemble Hashing Framework for Fast Image Retrieval

- Computer Science
- EIDWT
- 2017

This work applies ensemble approaches to the hashing problem: it first uses a weighting matrix to balance the variance of hash bits, and then uses bagging to inject diversity among hash tables.

Sketching Hashing

- 2015

Recently, hashing-based approximate nearest neighbor (ANN) search has attracted much attention. Extensive new algorithms have been developed and successfully applied to different applications.…

Online sketching hashing

- Computer Science
- 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015

A novel approach is presented to handle these two problems simultaneously based on the idea of data sketching; it can learn hash functions in an online fashion while requiring low computational complexity and storage space.

#### References

Showing 1-10 of 32 references

Random subspace for binary codes learning in large scale image retrieval

- Computer Science
- SIGIR
- 2014

This work introduces a random subspace strategy to address a limitation of hashing-based approximate nearest neighbor search methods: the unexpected phenomenon that a longer hashing code does not necessarily yield better performance.

Supervised hashing with kernels

- Computer Science
- 2012 IEEE Conference on Computer Vision and Pattern Recognition
- 2012

A novel kernel-based supervised hashing model is proposed that requires only a limited amount of supervised information (similar and dissimilar data pairs) and a feasible training cost to achieve high-quality hashing; it significantly outperforms the state of the art in searching for both metric-distance neighbors and semantically similar neighbors.

Iterative quantization: A procrustean approach to learning binary codes

- Mathematics, Computer Science
- CVPR 2011
- 2011

A simple and efficient alternating minimization scheme is proposed for finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube.
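The alternating scheme summarized above can be sketched as follows; this is an illustrative reconstruction of the ITQ-style iteration (binarize, then solve an orthogonal Procrustes problem for the rotation), not the authors' reference implementation, and the name `itq_rotation` is my own.

```python
import numpy as np

def itq_rotation(V, n_iter=50, seed=0):
    """Alternate between binarization and Procrustes rotation, ITQ-style.

    V: zero-centered, PCA-projected data of shape (n, c).
    Returns the learned rotation R and the resulting binary codes.
    """
    rng = np.random.default_rng(seed)
    c = V.shape[1]
    R, _ = np.linalg.qr(rng.normal(size=(c, c)))  # random orthogonal init
    for _ in range(n_iter):
        B = np.sign(V @ R)                  # fix R, update the binary codes
        U, _, Wt = np.linalg.svd(V.T @ B)   # fix B, orthogonal Procrustes step
        R = U @ Wt                          # rotation minimizing ||B - V R||_F
    return R, (V @ R > 0).astype(np.uint8)
```

Each step can only decrease the quantization error, so the alternation converges to a local minimum.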

Sequential Projection Learning for Hashing with Compact Codes

- Mathematics, Computer Science
- ICML
- 2010

This paper proposes a novel data-dependent projection learning method such that each hash function is designed to correct the errors made by the previous one sequentially, and shows significant performance gains over the state-of-the-art methods on two large datasets containing up to 1 million points.

Self-taught hashing for fast similarity search

- Computer Science
- SIGIR
- 2010

This paper proposes a novel Self-Taught Hashing (STH) approach to semantic hashing: it first finds the optimal l-bit binary codes for all documents in the given corpus via unsupervised learning, and then trains l classifiers via supervised learning to predict the l-bit code for any previously unseen query document.

Isotropic Hashing

- Computer Science, Mathematics
- NIPS
- 2012

Experimental results on real data sets show that IsoHash can outperform its counterpart with different variances for different dimensions, which verifies the viewpoint that projections with isotropic variances will be better than those with anisotropic variances.

Hashing with Graphs

- Mathematics, Computer Science
- ICML
- 2011

This paper proposes a novel graph-based hashing method which automatically discovers the neighborhood structure inherent in the data to learn appropriate compact codes, and describes a hierarchical threshold learning procedure in which each eigenfunction yields multiple bits, leading to higher search accuracy.

Bagging, Boosting and the Random Subspace Method for Linear Classifiers

- Mathematics, Computer Science
- Pattern Analysis & Applications
- 2002

Simulation studies show that the performance of the combining techniques is strongly affected by the small-sample-size properties of the base classifier: boosting is useful for large training sample sizes, while bagging and the random subspace method are useful for critical training sample sizes.

Spectral Hashing

- Computer Science, Mathematics
- NIPS
- 2008

The problem of finding the best code for a given dataset is closely related to graph partitioning and can be shown to be NP-hard; a spectral method is obtained whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian.

Harmonious Hashing

- Computer Science
- IJCAI
- 2013

A novel hashing algorithm called Harmonious Hashing is introduced which aims to learn hash functions with low information loss; it learns a set of optimized projections that preserve the maximum cumulative energy while meeting, as closely as possible, the constraint of equal variance on each dimension.