Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World
@article{Zhang2022TowardsSC, title={Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World}, author={Sen Zhang and Jing Zhang and Dacheng Tao}, journal={2022 International Conference on Robotics and Automation (ICRA)}, year={2022}, pages={5601-5607} }
Monocular visual odometry (VO) has attracted extensive research attention by providing real-time vehicle motion from cost-effective camera images. However, state-of-the-art optimization-based monocular VO methods suffer from the scale inconsistency problem for long-term predictions. Deep learning has recently been introduced to address this issue by leveraging stereo sequences or ground-truth motions in the training dataset. However, this comes at an additional cost for data collection, and such…
5 Citations
Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics
- Computer Science
- ECCV
- 2022
By leveraging IMU data during training, DynaDepth not only learns an absolute scale but also provides better generalization ability and robustness against vision degradation such as illumination change and moving objects.
Information-Theoretic Odometry Learning
- Computer Science
- International Journal of Computer Vision
- 2022
This paper bounds the generalization errors of the deep information bottleneck framework and the predictability of its stochastic latent representation, providing not only a performance guarantee but also practical guidance for model design, sample collection, and sensor selection.
JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes
- Computer Science
- ECCV
- 2022
A novel joint perception framework named JPerceiver is proposed, which can simultaneously estimate scale-aware depth and VO as well as BEV layout from a monocular video sequence based on a carefully-designed scale loss.
Towards Accurate Ground Plane Normal Estimation from Ego-Motion
- Computer Science
- Sensors
- 2022
A novel approach for ground plane normal estimation of wheeled vehicles that fully utilizes the underlying connection between the ego pose odometry (ego-motion) and its nearby ground plane, achieving state-of-the-art accuracy on the KITTI dataset with an estimated vector error of 0.39°.
SIR: Self-Supervised Image Rectification via Seeing the Same Scene From Multiple Different Lenses
- Computer Science
- IEEE Transactions on Image Processing
- 2023
A novel self-supervised image rectification (SIR) method based on the insight that the rectified results of distorted images of the same scene taken through different lenses should be the same; it achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art (SOTA) methods.
References
Enhancing Self-Supervised Monocular Depth Estimation with Traditional Visual Odometry
- Computer Science
- 2019 International Conference on 3D Vision (3DV)
- 2019
This paper further improves monocular depth estimation by integrating a geometrical prior into existing self-supervised networks, and proposes a sparsity-invariant autoencoder able to process the output of conventional visual odometry algorithms working in synergy with depth-from-mono networks.
Generalizing to the Open World: Deep Visual Odometry with Online Adaptation
- Computer Science
- 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2021
This paper proposes an online adaptation framework for deep VO with the assistance of scene-agnostic geometric computations and Bayesian inference that enables fast adaptation of deep VO networks to unseen environments in a self-supervised manner.
Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video
- Computer Science
- NeurIPS
- 2019
This paper proposes a geometry consistency loss for scale-consistent predictions and an induced self-discovered mask for handling moving objects and occlusions. It is the first work to show that deep networks trained on unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Visual Odometry Revisited: What Should Be Learnt?
- Computer Science
- 2020 IEEE International Conference on Robotics and Automation (ICRA)
- 2020
This work revisits the basics of VO, explores the right way to integrate deep learning with epipolar geometry and the Perspective-n-Point method, and designs a simple but robust frame-to-frame VO algorithm (DF-VO) that outperforms pure deep learning-based and geometry-based methods.
Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This paper proposes monoResMatch, a novel deep architecture that infers depth from a single input image by synthesizing features from a different point of view, horizontally aligned with the input image, and performing stereo matching between the two cues. It also shows how obtaining proxy ground-truth annotations through traditional stereo algorithms enables more accurate monocular depth estimation.
Towards Better Generalization: Joint Depth-Pose Learning Without PoseNet
- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
A novel system that explicitly disentangles scale from the network estimation, achieves state-of-the-art results among self-supervised learning-based methods on the KITTI Odometry and NYUv2 datasets, and presents some interesting findings on the limited generalization ability of PoseNet-based relative pose estimation methods.
Self-Supervised Deep Visual Odometry With Online Adaptation
- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
An online meta-learning algorithm is proposed to enable VO networks to continuously adapt to new environments in a self-supervised manner; it uses convolutional long short-term memory (convLSTM) to aggregate rich spatial-temporal information from the past.
Deep Online Correction for Monocular Visual Odometry
- Computer Science
- 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021
Though without complex back-end optimization modules, the proposed deep online correction framework achieves outstanding performance with relative transform error (RTE) = 2.0% on KITTI Odometry benchmark for Seq.
Real-Time Monocular Depth Estimation Using Synthetic Data with Domain Adaptation via Image Style Transfer
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This work takes advantage of style transfer and adversarial training to predict pixel-perfect depth from a single real-world color image, based on training over a large corpus of synthetic environment data.
Digging Into Self-Supervised Monocular Depth Estimation
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
It is shown that a surprisingly simple model, and associated design choices, lead to superior predictions, and together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods.