Degeneracy in Self-Calibration Revisited and a Deep Learning Solution for Uncalibrated SLAM

@article{Zhuang2019DegeneracyIS,
  title={Degeneracy in Self-Calibration Revisited and a Deep Learning Solution for Uncalibrated SLAM},
  author={Bingbing Zhuang and Quoc-Huy Tran and Pan Ji and Gim Hee Lee and Loong Fah Cheong and Manmohan Chandraker},
  journal={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2019},
  pages={3766-3773}
}
Self-calibration of camera intrinsics and radial distortion has a long history of research in the computer vision community. However, it remains rare to see real applications of such techniques to modern Simultaneous Localization And Mapping (SLAM) systems, especially in driving scenarios. In this paper, we revisit the geometric approach to this problem, and provide a theoretical proof that explicitly shows the ambiguity between radial distortion and scene depth when two-view geometry is used… 
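For context, degeneracy analyses of this kind are commonly carried out under a one-parameter division model of radial distortion (the paper's exact parameterization may differ); an illustrative sketch of the model and the ambiguity it induces is:

```latex
% Illustrative only: a standard one-parameter division model; the paper's
% exact parameterization of radial distortion may differ.
% A distorted image point x_d relates to its undistorted counterpart x_u by
\[
  \mathbf{x}_u \;=\; \frac{\mathbf{x}_d}{\,1 + \lambda \,\lVert \mathbf{x}_d \rVert^{2}\,},
\]
% where \lambda is the radial distortion coefficient. In two-view geometry,
% a perturbation of \lambda can be compensated by a corresponding change in
% the recovered scene depths, which is the kind of ambiguity the paper's
% proof makes explicit.
```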

Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction

This paper proposes a self-improving framework that jointly exploits narrow and wide baselines: the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM, yielding better accuracy and robustness than the monocular RGB SLAM baseline.

Learned Intrinsic Auto-Calibration From Fundamental Matrices

This work proposes to solve for the intrinsic calibration parameters using a neural network trained on a newly created synthetic Unity dataset, outperforming traditional methods by 2% to 30% and recent deep learning approaches by a factor of 2 to 4.

Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics

This study demonstrates how transformer-based architecture, though lower in run-time efficiency, achieves comparable performance while being more robust and generalizable.

Crowdsourced 3D Mapping: A Combined Multi-View Geometry and Self-Supervised Learning Approach

This work proposes a framework that estimates the 3D positions of semantically meaningful landmarks such as traffic signs without assuming known camera intrinsics, using only monocular color camera and GPS.

Self-Calibration Supported Robust Projective Structure-from-Motion

A unified SfM method in which the matching process is supported by self-calibration constraints, using the idea that good matches should yield a valid calibration to obtain robust matching from a set of putative correspondences.

Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty

A novel framework that probabilistically fuses the two families of predictions during network training, leveraging the complementary benefits of classical geometry and deep learning in a learnable way.

Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling

This paper models the long-term dependency in pose prediction using a pose network that features a two-layer convolutional LSTM module, and proposes a stage-wise training mechanism in which the first stage operates in a local time window and the second stage refines the poses with a "global" loss given the first-stage features.

Practical Auto-Calibration for Spatial Scene-Understanding from Crowdsourced Dashcamera Videos

The effectiveness of the proposed system for practical monocular onboard-camera auto-calibration from crowdsourced videos is demonstrated on the KITTI raw, Oxford RobotCar, and crowdsourced D$^2$-City datasets under varying conditions.

Image Stitching and Rectification for Hand-Held Cameras

A new differential homography that can account for the scanline-varying camera poses in Rolling Shutter (RS) cameras is derived, and its application to carry out RS-aware image stitching and rectification at one stroke is demonstrated.

LSTM and Filter Based Comparison Analysis for Indoor Global Localization in UAVs

Experimental results show that the proposed filter-based approach combined with a DL approach has promising performance in terms of accuracy and time efficiency in indoor localization of UAVs.

References

Showing 1-10 of 46 references.

DeepCalib: a deep learning approach for automatic intrinsic calibration of wide field-of-view cameras

This work builds upon recent developments in deep Convolutional Neural Networks (CNNs) and automatically estimates the intrinsic parameters of the camera from a single input image, using the large number of omnidirectional images available on the Internet to generate a large-scale dataset.

LSD-SLAM: Large-Scale Direct Monocular SLAM

A novel direct tracking method which operates on \(\mathfrak{sim}(3)\), thereby explicitly detecting scale-drift, and an elegant probabilistic solution to include the effect of noisy depth values into tracking are introduced.

FishEyeRecNet: A Multi-Context Collaborative Deep Network for Fisheye Image Rectification

This paper proposes an end-to-end multi-context collaborative deep network for removing distortions from single fisheye images and shows that the proposed model significantly outperforms current state-of-the-art methods.

Challenges in Monocular Visual Odometry: Photometric Calibration, Motion Bias, and Rolling Shutter Effect

This work quantitatively evaluates three influential yet easily overlooked aspects, namely photometric calibration, motion bias, and the rolling shutter effect, on state-of-the-art direct, feature-based, and semi-direct methods, providing the community with practical knowledge both for better applying existing methods and for developing new VO and SLAM algorithms.

Baseline Desensitizing in Translation Averaging

A simple yet effective bilinear objective function that introduces a variable to perform the requisite normalization of the angular error; it achieves overall superior accuracy on benchmark datasets compared to state-of-the-art methods and is also several times faster.

Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing

A deep convolutional neural network architecture that localizes semantic parts in the 2D image and in 3D space while inferring their visibility states, given a single RGB image.

Efficient Solution to the Epipolar Geometry for Radially Distorted Cameras

A new efficient solution to the estimation of the epipolar geometry of two radially distorted cameras from 10 image correspondences is presented; it manipulates the ten input polynomial equations and finds solutions efficiently using the Sturm sequence method.

Learning Structure-And-Motion-Aware Rolling Shutter Correction

This paper first makes a theoretical contribution by showing that RS two-view geometry is degenerate in the case of pure translational camera motion, and proposes a Convolutional Neural Network (CNN)-based method which learns the underlying geometry from just a single RS image and performs RS image correction.

Deep Supervision with Intermediate Concepts

This work explores an approach for injecting prior domain structure into neural network training by supervising hidden layers of a CNN with intermediate concepts that normally are not observed in practice, and formulates a probabilistic framework which formalizes these notions and predicts improved generalization via this deep supervision method.

Direct Sparse Odometry

The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.