# Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization

@article{Campbell2020SolvingTB, title={Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization}, author={Dylan Campbell and Liu Liu and Stephen Gould}, journal={ArXiv}, year={2020}, volume={abs/2007.14628} }

Blind Perspective-n-Point (PnP) is the problem of estimating the position and orientation of a camera relative to a scene, given 2D image points and 3D scene points, without prior knowledge of the 2D-3D correspondences. Solving for pose and correspondences simultaneously is extremely challenging since the search space is very large. Fortunately it is a coupled problem: the pose can be found easily given the correspondences and vice versa. Existing approaches assume that noisy correspondences…
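The coupling mentioned in the abstract can be illustrated with a toy example of the "vice versa" direction: once a pose is known, correspondences follow from projecting the 3D points and picking the nearest 2D point. The sketch below is only an illustration under simplifying assumptions (noise-free points, normalized image coordinates, no outliers); it is not the paper's method, and the function names `project` and `match_given_pose` are hypothetical.

```python
import math

def project(point, R, t):
    """Project a 3D point into normalized image coordinates with pose (R, t)."""
    # Camera-frame coordinates: X_c = R X + t, with R a 3x3 nested list.
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective division (assumes the point lies in front of the camera).
    return (xc[0] / xc[2], xc[1] / xc[2])

def match_given_pose(points_2d, points_3d, R, t):
    """For each 3D point, return the index of the nearest 2D point after projection."""
    matches = []
    for X in points_3d:
        u = project(X, R, t)
        dists = [math.dist(u, x) for x in points_2d]
        matches.append(dists.index(min(dists)))
    return matches

# Toy setup (assumed values): identity rotation, translation along the optical axis.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 4.0]
points_3d = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, 1.0)]
# The 2D points are the projections of points_3d, listed in shuffled order.
points_2d = [project(points_3d[i], R, t) for i in (2, 0, 1)]

matches = match_given_pose(points_2d, points_3d, R, t)
print(matches)
```

With noisy points or outliers, a hard nearest-neighbour assignment like this becomes unreliable, which is part of why the blind setting, where neither pose nor correspondences are known, is hard.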

#### 3 Citations

PlueckerNet: Learn to Register 3D Line Reconstructions

- Computer Science
- ArXiv
- 2020

Experiments on both indoor and outdoor datasets show that the registration (rotation and translation) precision of the neural network based method outperforms baselines significantly.

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose

- Computer Science
- ArXiv
- 2021

PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model, is introduced, based on the direct alignment of multiscale deep features, casting camera localization as metric learning.

To The Point: Correspondence-driven monocular 3D category reconstruction

- Computer Science
- ArXiv
- 2021

To The Point (TTP), a method for reconstructing 3D objects from a single image using 2D to 3D correspondences learned from weak supervision, uses a simple per-sample optimization problem to replace CNN-based regression of camera pose and non-rigid deformation and thereby obtain substantially more accurate 3D reconstructions.

#### References

The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation

- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019

This work casts the problem as a 2D-3D mixture model alignment task and proposes the first globally-optimal solution to this formulation under the robust L2 distance between mixture distributions, guaranteeing global optimality without requiring a pose estimate.

Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem

- Computer Science
- ArXiv
- 2020

A deep CNN model that simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences, and is capable of processing thousands of points a second with state-of-the-art accuracy.

Globally-Optimal Inlier Set Maximisation for Camera Pose and Correspondence Estimation

- Computer Science, Medicine
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2020

This work proposes a robust and globally-optimal inlier set maximisation approach that jointly estimates the optimal camera pose and correspondences, and outperforms existing approaches on challenging synthetic and real datasets, reliably finding the global optimum.

Pose Priors for Simultaneously Solving Alignment and Correspondence

- Mathematics, Computer Science
- ECCV
- 2008

This paper models the camera pose space as a Gaussian Mixture Model that is progressively refined by hypothesizing new correspondences, which rapidly reduces the number of potential matches for each 3D point and allows the pose space to be explored more thoroughly than with SoftPOSIT at a similar computational cost.

Understanding the Limitations of CNN-Based Absolute Camera Pose Regression

- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019

A theoretical model for camera pose regression is developed that is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure, and shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods.

EPnP: An Accurate O(n) Solution to the PnP Problem

- Mathematics, Computer Science
- International Journal of Computer Vision
- 2008

A non-iterative solution to the PnP problem (the estimation of the pose of a calibrated camera from n 3D-to-2D point correspondences) whose computational complexity grows linearly with n. The O(n) cost is achieved by expressing the camera-frame coordinates of four virtual control points as a weighted sum of the eigenvectors of a 12×12 matrix.

A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation

- Mathematics, Computer Science
- CVPR 2011
- 2011

This paper proposes a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame, at much lower computational cost.

Learning to Find Good Correspondences

- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018

A novel normalization technique, called Context Normalization, is introduced, which allows the network to process each data point separately while embedding global information in it, and also makes the network invariant to the order of the correspondences.

SoftPOSIT: Simultaneous Pose and Correspondence Determination

- Computer Science
- International Journal of Computer Vision
- 2004

A new algorithm, called SoftPOSIT, for determining the pose of a 3D object from a single 2D image when correspondences between object points and image points are not known, which has an asymptotic run-time complexity that is better than previous methods by a factor of the number of image points.

DSAC — Differentiable RANSAC for Camera Localization

- Computer Science, Mathematics
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017

DSAC is applied to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches, and it is demonstrated that by directly minimizing the expected loss of the output camera poses, robustly estimated by RANSAC, it achieves an increase in accuracy.