Learning to Find Good Correspondences

@article{Yi2018LearningTF,
  title={Learning to Find Good Correspondences},
  author={Kwang Moo Yi and Eduard Trulls and Yuki Ono and Vincent Lepetit and Mathieu Salzmann and Pascal V. Fua},
  journal={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2018},
  pages={2666-2674}
}
  • Published 16 November 2017
We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while embedding global information in it, and also makes the network invariant to the order of the correspondences. Our…
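The Context Normalization described in the abstract can be sketched in a few lines: each feature channel is normalized by the mean and standard deviation computed across the whole set of putative correspondences, so every point is processed separately while absorbing global statistics. This is a minimal NumPy illustration of the idea, not the authors' implementation; the function name is our own.

```python
import numpy as np

def context_norm(x, eps=1e-8):
    """Normalize each channel of x (N correspondences x d features)
    using statistics computed over the whole set (the 'context')."""
    mean = x.mean(axis=0, keepdims=True)  # per-channel mean over the set
    std = x.std(axis=0, keepdims=True)    # per-channel std over the set
    return (x - mean) / (std + eps)

# 100 putative correspondences, each encoded as (x1, y1, x2, y2)
rng = np.random.default_rng(0)
corrs = rng.normal(loc=2.0, scale=5.0, size=(100, 4))
out = context_norm(corrs)

# Permuting the input correspondences permutes the output identically,
# which is what makes the downstream network order-invariant.
perm = rng.permutation(100)
assert np.allclose(context_norm(corrs[perm]), out[perm])
```

Because the normalization is the only place where points interact, combining it with a per-point MLP keeps the architecture simple and small, as the abstract notes.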


D2D: Learning to find good correspondences for image matching and manipulation
TLDR
A simple approach to determining correspondences between image pairs under large changes in illumination, viewpoint, context, and material, which can be used to achieve state-of-the-art or competitive results on a wide range of tasks: local matching, camera localization, 3D reconstruction, and image stylization.
COTR: Correspondence Transformer for Matching Across Images
TLDR
A novel framework for finding correspondences in images based on a deep neural network that, given two images and a query point in one of them, finds its correspondence in the other, yielding a multiscale pipeline able to provide highly-accurate correspondences.
Learning Two-View Correspondences and Geometry Using Order-Aware Network
TLDR
This paper proposes Order-Aware Network, which infers the probabilities of correspondences being inliers and regresses the relative pose encoded by the essential matrix, and is built hierarchically and comprises three novel operations.
LF-Net: Learning Local Features from Images
TLDR
A novel deep architecture and a training strategy to learn a local feature pipeline from scratch, using collections of images without the need for human supervision, and shows that it can optimize the network in a two-branch setup by confining it to one branch, while preserving differentiability in the other.
Deep Kernelized Dense Geometric Matching
TLDR
This work proposes to formulate global correspondence estimation as a continuous probabilistic regression task using deep kernels, yielding a novel approach to learning dense correspondences.
Scene Coordinate and Correspondence Learning for Image-Based Localization
TLDR
This work proposes to regress confidences of scene coordinates pixel-wise for a given RGB image by using deep learning, which allows us to immediately discard erroneous predictions and improve the initial pose estimates.
S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching
TLDR
S2DNet is introduced, a novel feature matching pipeline designed and trained to efficiently establish both robust and accurate correspondences, and achieves state-of-the-art results on the HPatches benchmark, as well as on several long-term visual localization datasets.
Learning To Find Good Correspondences Of Multiple Objects
TLDR
This paper discretizes the 3D rotation space into twenty convex cones based on the facets of a regular icosahedron and proposes a deep architecture to simultaneously label the correspondences as inliers or outliers and classify the inlier into multiple objects.
Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints
TLDR
This paper designs an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection, while directly optimizing for the geometric pose objective, and shows both quantitatively and qualitatively that pose estimation performance on par with the classic pipeline can be achieved.
Learning Bipartite Graph Matching for Robust Visual Localization
TLDR
This work proposes a novel method to deal with 2D-3D matching in a very robust way using a bipartite graph and a deep neural network, referred to as Bipartite Graph Network (BGNet), to extract the global geometric information.

References

SHOWING 1-10 OF 39 REFERENCES
DeMoN: Depth and Motion Network for Learning Monocular Stereo
TLDR
This work trains a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs, and in contrast to the popular depth-from-single-image networks, DeMoN learns the concept of matching and better generalizes to structures not seen during training.
Discriminative Learning of Deep Convolutional Feature Point Descriptors
TLDR
This paper uses Convolutional Neural Networks to learn discriminant patch representations and in particular trains a Siamese network with pairs of (non-)corresponding patches to develop 128-D descriptors whose Euclidean distances reflect patch similarity and can be used as a drop-in replacement for any task involving SIFT.
DSAC — Differentiable RANSAC for Camera Localization
TLDR
DSAC is applied to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches, and it is demonstrated that by directly minimizing the expected loss of the output camera poses, robustly estimated by RANSAC, it achieves an increase in accuracy.
Toward Geometric Deep SLAM
TLDR
A point tracking system powered by two deep convolutional neural networks that are trained with simple synthetic data, alleviating the requirement of expensive external camera ground truthing and advanced graphics rendering pipelines.
SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
TLDR
SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places is introduced, and a generalization of bundle adjustment that incorporates object-to-object correspondences is introduced.
Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers
TLDR
A max-pooling approach to graph matching is proposed, which is not only resilient to deformations but also remarkably tolerant to outliers.
Unsupervised Learning of Depth and Ego-Motion from Video
TLDR
Empirical evaluation demonstrates that the unsupervised learning framework for monocular depth performs comparably with supervised methods that use either ground-truth pose or depth for training, and that pose estimation performs favorably compared to established SLAM systems under comparable input settings.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
TLDR
This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
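The permutation invariance that PointNet achieves, noted in the summary above, comes from applying a shared per-point MLP followed by a symmetric pooling function (max over the set). A minimal NumPy sketch of that mechanism, with made-up weights rather than a trained network:

```python
import numpy as np

def shared_mlp(points, w, b):
    """Apply the same weights to every point independently
    (equivalent to a 1x1 convolution / shared MLP), with ReLU."""
    return np.maximum(points @ w + b, 0.0)

def global_feature(points, w, b):
    """Per-point features followed by a symmetric max-pool,
    so the result does not depend on the ordering of the points."""
    return shared_mlp(points, w, b).max(axis=0)

rng = np.random.default_rng(0)
pts = rng.normal(size=(128, 3))          # a toy point cloud
w = rng.normal(size=(3, 16))             # illustrative random weights
b = rng.normal(size=16)

g1 = global_feature(pts, w, b)
g2 = global_feature(pts[rng.permutation(128)], w, b)
assert np.allclose(g1, g2)               # invariant to input ordering
```

The same shared-MLP-plus-symmetric-pooling recipe is what makes architectures operating on unordered correspondence sets, like the one in this paper, order-invariant.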
Generic 3D Representation via Pose Estimation and Matching
TLDR
This paper empirically shows that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks without the need for fine-tuning and shows traits of abstraction abilities.
LIFT: Learned Invariant Feature Transform
TLDR
This work introduces a novel Deep Network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description, and shows how to learn to do all three in a unified manner while preserving end-to-end differentiability.