Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization

Dylan Campbell, Liu Liu, Stephen Gould
Blind Perspective-n-Point (PnP) is the problem of estimating the position and orientation of a camera relative to a scene, given 2D image points and 3D scene points, without prior knowledge of the 2D-3D correspondences. Solving for pose and correspondences simultaneously is extremely challenging since the search space is very large. Fortunately, it is a coupled problem: the pose can be found easily given the correspondences and vice versa. Existing approaches assume that noisy correspondences…

PlueckerNet: Learn to Register 3D Line Reconstructions
Experiments on both indoor and outdoor datasets show that the neural-network-based method achieves significantly higher registration precision (rotation and translation) than the baselines.
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
Introduces PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. It is based on the direct alignment of multiscale deep features, casting camera localization as metric learning.
To The Point: Correspondence-driven monocular 3D category reconstruction
To The Point (TTP), a method for reconstructing 3D objects from a single image using 2D to 3D correspondences learned from weak supervision, uses a simple per-sample optimization problem to replace CNN-based regression of camera pose and non-rigid deformation and thereby obtain substantially more accurate 3D reconstructions.


The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation
This work casts the problem as a 2D-3D mixture model alignment task and proposes the first globally-optimal solution to this formulation under the robust L2 distance between mixture distributions, guaranteeing global optimality without requiring a pose estimate.
Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem
A deep CNN model that simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences, and is capable of processing thousands of points a second with state-of-the-art accuracy.
Globally-Optimal Inlier Set Maximisation for Camera Pose and Correspondence Estimation
This work proposes a robust and globally-optimal inlier set maximisation approach that jointly estimates the optimal camera pose and correspondences, and outperforms existing approaches on challenging synthetic and real datasets, reliably finding the global optimum.
Pose Priors for Simultaneously Solving Alignment and Correspondence
This paper models the camera pose space as a Gaussian Mixture Model that is progressively refined by hypothesizing new correspondences, which rapidly reduces the number of potential matches for each 3D point and lets us explore the pose space more thoroughly than SoftPosit at a similar computational cost.
Understanding the Limitations of CNN-Based Absolute Camera Pose Regression
A theoretical model for camera pose regression is developed that is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure, and shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods.
EPnP: An Accurate O(n) Solution to the PnP Problem
A non-iterative solution to the PnP problem (the estimation of the pose of a calibrated camera from n 3D-to-2D point correspondences) whose computational complexity grows linearly with n. The O(n) running time is achieved by expressing the camera-frame coordinates of four virtual control points as a weighted sum of the eigenvectors of a 12×12 matrix.
A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation
This paper proposes a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame, at much lower computational cost.
Learning to Find Good Correspondences
A novel normalization technique, called Context Normalization, is introduced, which allows the network to process each data point separately while embedding global information in it, and also makes the network invariant to the order of the correspondences.
SoftPOSIT: Simultaneous Pose and Correspondence Determination
A new algorithm, called SoftPOSIT, for determining the pose of a 3D object from a single 2D image when correspondences between object points and image points are not known, which has an asymptotic run-time complexity that is better than previous methods by a factor of the number of image points.
DSAC — Differentiable RANSAC for Camera Localization
DSAC is applied to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches. Directly minimizing the expected loss of the output camera poses, which are robustly estimated by RANSAC, is shown to increase accuracy.