• Corpus ID: 195848289

UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor

Peter Hviid Christiansen, Mikkel Fly Kragh, Yury Brodskiy, Henrik Karstoft
It is hard to create consistent ground truth data for interest points in natural images, since interest points are hard to define clearly and consistently for a human annotator. This makes interest point detectors non-trivial to build. In this work, we introduce an unsupervised deep learning-based interest point detector and descriptor. Using a self-supervised approach, we utilize a siamese network and a novel loss function that enables interest point scores and positions to be learned… 



DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization

This work proposes an end-to-end framework that jointly learns keypoint detection, descriptor representation, and cross-frame matching for image-based 3D localization, yielding more accurate localization than both traditional methods and state-of-the-art weakly supervised methods.

Neural Outlier Rejection for Self-Supervised Keypoint Learning

This work proposes a novel end-to-end self-supervised learning scheme that can effectively exploit unlabeled data to provide more reliable keypoints under various scene conditions and greatly improves the quality of feature matching and homography estimation on challenging benchmarks over the state-of-the-art.

SOLD2: Self-supervised Occlusion-aware Line Description and Detection

This work proposes the first joint detection and description of line segments in a single deep network, which is highly discriminative, while remaining robust to viewpoint changes and occlusions.

Self-Supervised 3D Keypoint Learning for Ego-motion Estimation

This work proposes a fully self-supervised approach towards learning depth-aware keypoints from unlabeled videos by incorporating a differentiable pose estimation module that jointly optimizes the keypoints and their depths in a Structure-from-Motion setting.

Soft Expectation and Deep Maximization for Image Feature Detection

This work proposes SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space, and applies this detector to standard benchmarks in visual localization, sparse 3D reconstruction, and mean matching accuracy.

Digging Into Self-Supervised Learning of Feature Descriptors

This work proposes a coarse-to-fine method for mining local hard negatives from a wider search space by using global visual image descriptors and demonstrates that a combination of synthetic homography transformation, color augmentation, and photorealistic image stylization produces useful representations that are viewpoint and illumination invariant.

P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching

A dual fully-convolutional framework is presented that maps 2D and 3D inputs into a shared latent representation space to simultaneously describe and detect keypoints and achieves state-of-the-art results for the task of indoor visual localization.

ASLFeat: Learning Local Features of Accurate Shape and Localization

  • Zixin Luo, Lei Zhou, Long Quan
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors, by resorting to deformable convolutional networks to densely estimate and apply local transformation in ASLFeat.

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

This paper proposes a keypoint selection strategy that overcomes the inherent density variations of 3D point clouds, and proposes a self-supervised detector loss guided by the on-the-fly feature matching results during training.



Quad-Networks: Unsupervised Learning to Rank for Interest Point Detection

This paper is the first to propose such a formulation: training a neural network to rank points in a transformation-invariant manner. It shows that this unsupervised method performs better than or on par with baselines on two tasks.

SuperPoint: Self-Supervised Interest Point Detection and Description

This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision and introduces Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection repeatability and performing cross-domain adaptation.

PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization

This work trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner, with no need for additional engineering or graph optimisation, demonstrating that convnets can be used to solve complicated out-of-image-plane regression problems.

TILDE: A Temporally Invariant Learned DEtector

We introduce a learning-based approach to detect repeatable keypoints under drastic imaging changes of weather and lighting conditions to which state-of-the-art keypoint detectors are surprisingly

LF-Net: Learning Local Features from Images

This work introduces a novel deep architecture and a training strategy to learn a local feature pipeline from scratch, using collections of images without human supervision, and shows that the network can be optimized in a two-branch setup by confining optimization to one branch while preserving differentiability in the other.

DeepVO: Towards end-to-end visual odometry with deep Recurrent Convolutional Neural Networks

Extensive experiments on the KITTI VO dataset show competitive performance to state-of-the-art methods, verifying that the end-to-end Deep Learning technique can be a viable complement to the traditional VO systems.

Discriminative Learning of Deep Convolutional Feature Point Descriptors

This paper uses convolutional neural networks to learn discriminant patch representations. In particular, it trains a Siamese network with pairs of (non-)corresponding patches to develop 128-D descriptors whose Euclidean distances reflect patch similarity and which can be used as a drop-in replacement for any task involving SIFT.

Geometric Loss Functions for Camera Pose Regression with Deep Learning

  • Alex Kendall, R. Cipolla
  • Computer Science
    2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
A number of novel loss functions for learning camera pose which are based on geometry and scene reprojection error are explored, and it is shown how to automatically learn an optimal weighting to simultaneously regress position and orientation.

LIFT: Learned Invariant Feature Transform

This work introduces a novel Deep Network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description, and shows how to learn to do all three in a unified manner while preserving end-to-end differentiability.

MatchNet: Unifying feature and metric learning for patch-based matching

This work presents a unified approach that combines feature computation and similarity networks to train a patch matching system, improving accuracy over previous state-of-the-art results on patch matching datasets while reducing the storage requirement for descriptors.