Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
@inproceedings{Dong2021RobustNR, title={Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments}, author={Siyan Dong and Qingnan Fan and He Wang and Ji Shi and Li Yi and Thomas A. Funkhouser and Baoquan Chen and Leonidas J. Guibas}, booktitle={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2021}, pages={8540-8550} }
Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, and other applications. Recent advances estimate the camera pose via optimization over 2D-3D or 3D-3D correspondences established between coordinates in 2D/3D camera space and 3D world space. Such a mapping is estimated with either a convolutional neural network or a decision tree using only static input image sequences, which makes these approaches vulnerable to dynamic indoor environments that…
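For the RGB-D (3D-3D) case, the pose optimization described above reduces to estimating a rigid transform between corresponding camera-space and world-space points. A minimal sketch of that step — not the paper's method, but the standard least-squares Kabsch/SVD solution it builds on, assuming outlier-free correspondences (in practice this would be wrapped in a RANSAC loop):

```python
import numpy as np

def estimate_pose_kabsch(cam_pts, world_pts):
    """Estimate (R, t) such that world_pts ≈ R @ cam_pts + t.

    cam_pts, world_pts: (N, 3) arrays of matched 3D points.
    Least-squares rigid alignment via SVD (Kabsch algorithm).
    """
    c_cam = cam_pts.mean(axis=0)
    c_world = world_pts.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (cam_pts - c_cam).T @ (world_pts - c_world)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction guarantees a proper rotation (det(R) = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_world - R @ c_cam
    return R, t
```

Given predicted scene coordinates for each depth pixel, this recovers the camera-to-world pose; robustness in dynamic scenes then hinges on how reliable the predicted correspondences are.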
4 Citations
Visually plausible human-object interaction capture from wearable sensors
- Computer Science, ArXiv, 2022
HOPS is the first method to capture interactions such as dragging objects and opening doors from ego-centric data alone, making it possible to track objects even when they are not visible from the head camera.
CrowdDriven: A New Challenging Dataset for Outdoor Visual Localization
- Computer Science, 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
This work proposes a new benchmark for visual localization in outdoor scenes, using crowd-sourced data to cover a wide range of geographical regions and camera devices with a focus on the failure cases of current algorithms.
Multi-Modal Visual Place Recognition in Dynamics-Invariant Perception Space
- Computer Science, IEEE Signal Processing Letters, 2021
This letter is the first to explore multi-modal fusion of semantic and visual modalities in a dynamics-invariant space to improve place recognition in dynamic environments: a novel deep-learning architecture generates a static semantic segmentation and recovers the static image directly from the corresponding dynamic image.
Projective Manifold Gradient Layer for Deep Rotation Regression
- Computer Science, Mathematics, ArXiv, 2021
The proposed regularized projective manifold gradient (RPMG) method helps networks achieve new state-of-the-art performance in a variety of rotation estimation tasks and can be applied to other smooth manifolds such as the unit sphere.
References
Showing 10 of 60 references
Backtracking regression forests for accurate camera relocalization
- Computer Science, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
This work proposes a sample-balanced objective that encourages equal numbers of samples in the left and right sub-trees, and a novel backtracking scheme to remedy incorrect 2D-3D correspondence predictions.
Geometry-Aware Learning of Maps for Camera Localization
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This work proposes to represent maps as a deep neural net called MapNet, which enables learning a data-driven map representation and proposes a novel parameterization for camera rotation which is better suited for deep-learning based camera pose regression.
Full-Frame Scene Coordinate Regression for Image-Based Localization
- Computer Science, Robotics: Science and Systems, 2018
This paper proposes to perform the scene coordinate regression in a full-frame manner to make the computation efficient at test time and to add more global context to the regression process to improve the robustness.
Learning Less is More - 6D Camera Localization via 3D Surface Regression
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This work addresses the task of predicting the 6D camera pose from a single RGB image in a given 3D environment by developing a fully convolutional neural network for densely regressing so-called scene coordinates, defining the correspondence between the input image and the 3D scene space.
SANet: Scene Agnostic Network for Camera Localization
- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
This paper presents a scene-agnostic neural architecture for camera localization, in which model parameters and scenes are independent of each other, and which predicts a dense scene coordinate map of a query RGB image on the fly for an arbitrary scene.
Random forests versus Neural Networks — What's best for camera localization?
- Computer Science, 2017 IEEE International Conference on Robotics and Automation (ICRA)
The experimental findings show that for scene coordinate regression, traditional NN architectures are superior to test-time-efficient RFs and ForestNets; however, this does not translate to final 6D camera pose accuracy, where RFs and ForestNets perform slightly better.
Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
- Computer Science, 2013 IEEE Conference on Computer Vision and Pattern Recognition
We address the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image. Our approach employs a regression forest that is capable of inferring…
On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation
- Computer Science, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
This paper shows how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly, achieving relocalisation performance on a par with that of offline-trained forests; the approach runs in under 150 ms, making it desirable for real-time systems that require online relocalisation.
PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization
- Computer Science, 2015 IEEE International Conference on Computer Vision (ICCV)
This work trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner, with no need for additional engineering or graph optimisation, demonstrating that convnets can be used to solve complicated out-of-image-plane regression problems.
Geometric Loss Functions for Camera Pose Regression with Deep Learning
- Computer Science, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
A number of novel loss functions for learning camera pose which are based on geometry and scene reprojection error are explored, and it is shown how to automatically learn an optimal weighting to simultaneously regress position and orientation.