Self-Supervised Monocular Image Depth Learning and Confidence Estimation

@article{Chen2020SelfSupervisedMI,
  title={Self-Supervised Monocular Image Depth Learning and Confidence Estimation},
  author={Long Chen and Wen Tang and Nigel W. John},
  journal={Neurocomputing},
  year={2020},
  volume={381},
  pages={272--281}
}

On the Uncertainty of Self-Supervised Monocular Depth Estimation
TLDR
This work explores for the first time how to estimate the uncertainty for this task and how this affects depth accuracy, and proposes a novel peculiar technique specifically designed for self-supervised approaches.
Superb Monocular Depth Estimation Based on Transfer Learning and Surface Normal Guidance
TLDR
A novel monocular depth estimation method was proposed that primarily utilizes a lighter-weight Convolutional Neural Network structure for coarse depth prediction and then refines the coarse depth images by combining surface normal guidance.
Unsupervised Depth and Confidence Prediction from Monocular Images using Bayesian Inference
TLDR
An unsupervised deep learning framework with Bayesian inference is proposed for improving the accuracy of per-pixel depth prediction from monocular RGB images, and is shown to outperform the existing state-of-the-art methods for depth prediction on the publicly available KITTI outdoor dataset.
Loop-Net: Joint Unsupervised Disparity and Optical Flow Estimation of Stereo Videos With Spatiotemporal Loop Consistency
TLDR
This letter proposes a joint framework that estimates disparity and optical flow of stereo videos without supervision and generalizes across various video frames by considering the spatiotemporal relation between disparity and flow. It also introduces a video-based training scheme using the c-LSTM to reinforce temporal consistency.
Recovering dense 3D point clouds from single endoscopic image
On the Road With 16 Neurons: Towards Interpretable and Manipulable Latent Representations for Visual Predictions in Driving Scenarios
TLDR
A strategy for visual perception in the context of autonomous driving is proposed that relies on compact representations with as few as 16 neural units for each of the two basic driving concepts the authors consider: cars and lanes.
Autonomous quadrotor obstacle avoidance based on dueling double deep recurrent Q network with monocular vision
TLDR
This framework enables the quadrotor to realize autonomous obstacle avoidance without any prior environment information or labeled datasets, and uses dueling double deep recurrent Q networks to eliminate the negative effects of the limited observation capacity of the on-board monocular camera.
Automated joining element design by predicting spot-weld locations using 3D convolutional neural networks
TLDR
This work proposes a novel methodology to predict joining element locations using machine learning, and describes two approaches to predict specifically spot-weld locations using voxels as data representation.

References

Deeper Depth Prediction with Fully Convolutional Residual Networks
TLDR
A fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps is proposed and a novel way to efficiently learn feature map up-sampling within the network is presented.
Depth estimation with convolutional conditional random field network
Unsupervised Monocular Depth Estimation with Left-Right Consistency
TLDR
This paper proposes a novel training objective that enables the convolutional neural network to learn to perform single image depth estimation, despite the absence of ground truth depth data, and produces state of the art results for monocular depth estimation on the KITTI driving dataset.
Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation
TLDR
This paper addresses the problem of depth estimation from a single still image by designing a novel CNN implementation of mean-field updates for continuous CRFs and demonstrates the effectiveness of the proposed approach and establishes new state of the art results on publicly available datasets.
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
TLDR
This work proposes an unsupervised framework to learn a deep convolutional neural network for single view depth prediction, without requiring a pre-training stage or annotated ground-truth depths, and shows that this network trained on less than half of the KITTI dataset gives comparable performance to that of the state-of-the-art supervised methods for single view depth estimation.
Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks
TLDR
This work presents a new model for simultaneous depth estimation and semantic segmentation from a single RGB image and couples the deep CNN with a fully connected CRF, which captures the contextual relationships and interactions between the semantic and depth cues, improving the accuracy of the final results.
Estimating Depth From Monocular Images as Classification Using Deep Fully Convolutional Residual Networks
TLDR
By performing depth classification instead of regression, this paper can easily obtain the confidence of a depth prediction in the form of probability distribution and apply an information gain loss to make use of the predictions that are close to ground-truth during training, as well as fully-connected conditional random fields for post-processing to further improve the performance.
Semi-Supervised Deep Learning for Monocular Depth Map Prediction
TLDR
This paper proposes a novel approach to depth map prediction from monocular images that learns in a semi-supervised way and uses sparse ground-truth depth for supervised learning, and also enforces the deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss.
A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images
TLDR
A fast-to-train two-streamed CNN is proposed that predicts depth and depth gradients, which are then fused together into an accurate and detailed depth map; a novel set loss over multiple images is also defined.
Learning Depth from Monocular Videos Using Direct Methods
TLDR
It is argued that the depth CNN predictor can be learned without a pose CNN predictor, and it is demonstrated empirically that incorporating a differentiable implementation of DVO, along with a novel depth normalization strategy, substantially improves performance over state-of-the-art methods that use monocular videos for training.