GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints

@inproceedings{Luo2018GeoDescLL,
  title={GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints},
  author={Zixin Luo and Tianwei Shen and Lei Zhou and Siyu Zhu and Runze Zhang and Yao Yao and Tian Fang and Long Quan},
  booktitle={ECCV},
  year={2018}
}
Learned local descriptors based on Convolutional Neural Networks (CNNs) have achieved significant improvements on patch-based benchmarks, but have not demonstrated strong generalization ability on recent benchmarks of image-based 3D reconstruction. In this paper, we mitigate this limitation by proposing a novel local descriptor learning approach that integrates geometry constraints from multi-view reconstructions, which benefits the learning process in terms of data generation, data…
D2-Net: A Trainable CNN for Joint Detection and Description of Local Features
TLDR
This work proposes an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector, and shows that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations.
HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning
TLDR
This work reformulates the classical hard-mining triplet loss as a new detector optimisation term to refine candidate positions based on the descriptor map, and proposes a dense descriptor that uses a multi-scale approach and a hybrid combination of hand-crafted and learned features to obtain rotation and scale robustness by design.
D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
TLDR
This paper proposes a keypoint selection strategy that overcomes the inherent density variations of 3D point clouds, and proposes a self-supervised detector loss guided by the on-the-fly feature matching results during training.
ASLFeat: Learning Local Features of Accurate Shape and Localization
  • Zixin Luo, Lei Zhou, Long Quan
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR
This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors, by resorting to deformable convolutional networks to densely estimate and apply local transformation in ASLFeat.
Extremely Dense Point Correspondences Using a Learned Feature Descriptor
TLDR
This work presents an effective self-supervised training scheme and novel loss design for dense descriptor learning and demonstrates that the proposed dense descriptor can generalize to unseen patients and scopes, thereby largely improving the performance of Structure from Motion (SfM) in terms of model density and completeness.
Learning to Guide Local Feature Matches
TLDR
A learning-based approach that guides local feature matches via a learned approximate image matching can boost the results of SIFT to a level similar to state-of-the-art deep descriptors, such as SuperPoint, ContextDesc, or D2-Net, and can also improve performance for these descriptors.
SEKD: Self-Evolving Keypoint Detection and Description
TLDR
A self-supervised framework, namely self-evolving keypoint detection and description (SEKD), is proposed to learn an advanced local feature model from unlabeled natural images that outperforms popular hand-crafted and DNN-based methods by remarkable margins.
Learning Feature Descriptors using Camera Pose Supervision
TLDR
This paper proposes a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images, and designs both a new loss function that exploits the epipolar constraint given by camera poses and a new model architecture that makes the whole pipeline differentiable and efficient.
HyNet: Learning Local Descriptor with Hybrid Similarity Measure and Triplet Loss
TLDR
HyNet is proposed, a new local descriptor that leads to state-of-the-art results in matching and surpasses previous methods by a significant margin on standard benchmarks that include patch matching, verification, and retrieval, as well as outperforming full end-to-end methods on 3D reconstruction tasks.

References

PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors
TLDR
This paper proposes a CNN based descriptor with improved matching performance, significantly reduced training and execution time, as well as low dimensionality, and introduces a new loss function that exploits the relations within the triplets.
Discriminative Learning of Deep Convolutional Feature Point Descriptors
TLDR
This paper uses Convolutional Neural Networks to learn discriminant patch representations and, in particular, trains a Siamese network with pairs of (non-)corresponding patches to develop 128-D descriptors whose Euclidean distances reflect patch similarity and which can be used as a drop-in replacement for any task involving SIFT.
HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors
TLDR
A novel benchmark for evaluating local image descriptors is proposed, and it is shown that a simple normalisation of traditional hand-crafted descriptors can boost their performance to the level of deep learning based descriptors within a realistic benchmark evaluation.
Learning local feature descriptors with triplets and shallow convolutional neural networks
TLDR
This work proposes to utilize triplets of training samples, together with in-triplet mining of hard negatives, and shows that this method achieves state of the art results, without the computational overhead typically associated with mining of negatives and with lower complexity of the network architecture.
L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space
TLDR
The good generalization ability shown by experiments indicates that L2-Net can serve as a direct substitution for existing handcrafted descriptors; the paper also proposes a progressive sampling strategy that enables the network to access billions of training samples in a few epochs.
Learning to Assign Orientations to Feature Points
TLDR
This work shows how to train a Convolutional Neural Network to assign a canonical orientation to feature points given an image patch centered on the feature point, and proposes a new type of activation function for Neural Networks that generalizes the popular ReLU, maxout, and PReLU activation functions.
A Large Dataset for Improving Patch Matching
TLDR
Experimental evaluations show that the descriptors trained using the proposed dataset outperform the current state-of-the-art descriptors trained on MVS by 8%, 4% and 10% on matching, verification and retrieval tasks respectively on the HPatches dataset and Strecha dataset.
Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions
TLDR
A combination of the triplet and global losses produces the best embedding in the field using this triplet network, and the central-surround siamese network trained with the global loss is shown to produce the best result in the field on the UBC dataset.
MVSNet: Depth Inference for Unstructured Multi-view Stereo
TLDR
This work presents an end-to-end deep learning architecture for depth map inference from multi-view images that flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature.
MatchNet: Unifying feature and metric learning for patch-based matching
TLDR
A unified approach that combines feature computation and similarity networks for training a patch matching system is shown to improve accuracy over previous state-of-the-art results on patch matching datasets while reducing the storage requirement for descriptors.