Aggregation of Binary Feature Descriptors for Compact Scene Model Representation in Large Scale Structure-from-Motion Applications

Jacek Komorowski and Tomasz Trzciński
In this paper we present an efficient method for aggregating binary feature descriptors that allows a compact representation of the 3D scene model in incremental structure-from-motion and SLAM applications. All feature descriptors linked with one 3D scene point or landmark are represented by a single low-dimensional real-valued vector called a prototype. The method significantly reduces the memory required to store and process feature descriptors in large-scale structure-from-motion applications…
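To make the prototype idea concrete, here is a minimal sketch of one way to collapse several binary descriptors of a single landmark into one real-valued vector. The abstract is truncated and does not say how the prototype is computed, so everything here is an assumption for illustration only: the per-bit mean as the aggregation rule, the L1 distance for matching, and the helper names `unpack_bits`, `make_prototype`, and `distance` are all hypothetical. The sketch also keeps one value per bit, so it does not show the dimensionality reduction the abstract mentions.

```python
from typing import List, Sequence


def unpack_bits(descriptor: bytes) -> List[int]:
    """Expand a packed binary descriptor into a list of 0/1 values (MSB first)."""
    return [(byte >> (7 - i)) & 1 for byte in descriptor for i in range(8)]


def make_prototype(descriptors: Sequence[bytes]) -> List[float]:
    """Assumed aggregation rule: per-bit mean over all descriptors of one landmark."""
    if not descriptors:
        raise ValueError("need at least one descriptor")
    bits = [unpack_bits(d) for d in descriptors]
    count = len(bits)
    # zip(*bits) iterates over bit positions across all descriptors.
    return [sum(column) / count for column in zip(*bits)]


def distance(descriptor: bytes, prototype: Sequence[float]) -> float:
    """Assumed matching rule: L1 distance between descriptor bits and the prototype."""
    return sum(abs(b - p) for b, p in zip(unpack_bits(descriptor), prototype))


# Three toy 8-bit observations of the same landmark (real descriptors are longer).
observations = [bytes([0b11110000]), bytes([0b11100000]), bytes([0b11110001])]
prototype = make_prototype(observations)

# A query descriptor is compared against the single prototype
# instead of against every stored observation.
query = bytes([0b11110000])
print(prototype)
print(distance(query, prototype))
```

In this toy setup the model stores one vector per landmark rather than one descriptor per observation, which is where the memory saving would come from; how the paper reaches a low-dimensional prototype is not recoverable from the truncated abstract.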



From structure-from-motion point clouds to fast location recognition

A fast location recognition technique based on structure-from-motion point clouds is presented, in which vocabulary tree-based indexing of features directly returns relevant fragments of 3D models instead of documents from the image database.

A joint compression scheme for local binary feature descriptors and their corresponding bag-of-words representation

A compression scheme for local binary features is presented that jointly encodes the descriptors and their respective Bag-of-Words representation using a vocabulary shared between client and server, reducing ORB features to 60.62% of their uncompressed size.

On aggregation of local binary descriptors

  • S. Husain, M. Bober
  • Computer Science
    2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
  • 2016
A robust global image representation, the Binary Robust Visual Descriptor (B-RVD), is proposed; it uses rank-based multi-assignment of local descriptors and direction-based aggregation, achieved by applying the L1-norm to residual vectors.

Coding binary local features extracted from video sequences

This paper proposes a coding architecture specifically designed for binary local features extracted from video content that exploits both spatial and temporal redundancy by means of intra-frame and inter-frame coding modes, showing that significant coding gains can be attained for a target level of accuracy of the visual analysis task.

Minimal Scene Descriptions from Structure from Motion Models

  • Song Cao, Noah Snavely
  • Computer Science
    2014 IEEE Conference on Computer Vision and Pattern Recognition
  • 2014
A new method for computing compact models that takes into account both image-point relationships and feature distinctiveness is introduced, and it is shown that this method produces small models that yield better recognition performance than previous model reduction techniques.

Location Recognition Using Prioritized Feature Matching

This work devises an adaptive, prioritized algorithm for matching a representative set of SIFT features covering a large scene to a query image for efficient localization. It considers features in the scene database and matches them to query image features, as opposed to more conventional methods that match image features to visual words or database features.

Aggregating binary local descriptors for image retrieval

The results show that aggregation methods on binary features are effective and represent a worthwhile alternative to the direct matching and the combination of the aggregated binary features with the emerging Convolutional Neural Network features.

Rate-accuracy optimization of binary descriptors

This work designs an entropy coding scheme that seeks the internal ordering of the descriptor that minimizes the number of bits necessary to represent it and evaluates the discriminative power of descriptors as a function of rate, in order to investigate the trade-offs in a bandwidth constrained scenario.

Survey of SIFT Compression Schemes

This work performs a comprehensive survey of Scale Invariant Feature Transform (SIFT) compression schemes proposed in the literature and compares them to the recently proposed low bit-rate Compressed Histogram of Gradients (CHoG) descriptor, showing that CHoG outperforms all SIFT compression schemes.

FREAK: Fast Retina Keypoint

This work proposes a novel keypoint descriptor inspired by the human visual system and more precisely the retina, coined Fast Retina Keypoint (FREAK), which is in general faster to compute with lower memory load and also more robust than SIFT, SURF or BRISK.