Sparse-to-Continuous: Enhancing Monocular Depth Estimation using Occupancy Maps

  • Nícolas dos Santos Rosa, Vitor Campanholo Guizilini, Valdir Grassi
  • 2019 19th International Conference on Advanced Robotics (ICAR)
This paper addresses the problem of single image depth estimation (SIDE), focusing on improving the quality of deep neural network predictions. In a supervised learning scenario, the quality of predictions is intrinsically related to the training labels, which guide the optimization process. For indoor scenes, structured-light-based depth sensors (e.g. Kinect) are able to provide dense, albeit short-range, depth maps. On the other hand, for outdoor scenes, LiDARs are considered the standard… 


Depth Completion with Morphological Operations: An Intermediate Approach to Enhance Monocular Depth Estimation

  • R. Q. Mendes, E. G. Ribeiro, N. Rosa, V. Grassi
  • Computer Science
    2020 Latin American Robotics Symposium (LARS), 2020 Brazilian Symposium on Robotics (SBR) and 2020 Workshop on Robotics in Education (WRE)
  • 2020
This work addresses the SIDE and depth completion tasks jointly, focusing on the design of a lightweight method to be applied in real self-driving scenarios, and introduces a fast and efficient densification algorithm, based on closing morphology, and a deep network pipeline that uses the densified reference depth maps for training.

Deep Learning-Based Monocular Depth Estimation Methods—A State-of-the-Art Review

A comprehensive overview of monocular depth estimation from Red-Green-Blue images including the problem representation and a short description of traditional methods for depth estimation is provided.

Monocular Depth Estimation Using Deep Learning: A Review

This paper highlights the critical points of state-of-the-art work on MDE from several aspects, including input data shapes and training schemes (supervised, semi-supervised, and unsupervised learning), together with the datasets and evaluation metrics used.

PhoneDepth: A Dataset for Monocular Depth Estimation on Mobile Devices

PhoneDepth, a novel dataset that takes advantage of modern phone hardware and professional stereo cameras, is introduced, and its value is demonstrated by training neural networks with multiple depth supervision, by fine-tuning on other datasets, and by depth refinement.

When the Sun Goes Down: Repairing Photometric Losses for All-Day Depth Estimation

This paper shows how to use a combination of three techniques to allow the existing photometric losses to work for both day and nighttime images, and introduces a per-pixel neural intensity transformation to compensate for the light changes that occur between successive frames.

Focal-WNet: An Architecture Unifying Convolution and Attention for Depth Estimation

This paper proposes a novel architecture named Focal-WNet, which consists of two separate encoders and a single decoder, and incorporates focal self-attention instead of vanilla self-attention to reduce the computational complexity of the network.

Towards Real-Time Monocular Depth Estimation for Robotics: A Survey

A comprehensive survey of MDE covering various methods is provided, popular performance evaluation metrics are introduced, publicly available datasets are summarized, and some promising directions for future research are presented.



Semi-Supervised Deep Learning for Monocular Depth Map Prediction

This paper proposes a novel approach to depth map prediction from monocular images that learns in a semi-supervised way and uses sparse ground-truth depth for supervised learning, and also enforces the deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss.

Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image

  • Fangchang Ma, S. Karaman
  • Computer Science
    2018 IEEE International Conference on Robotics and Automation (ICRA)
  • 2018
The use of a single deep regression network to learn directly from the RGB-D raw data is proposed, and the impact of the number of depth samples on prediction accuracy is explored, to attain a higher level of robustness and accuracy.

Deeper Depth Prediction with Fully Convolutional Residual Networks

A fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps is proposed and a novel way to efficiently learn feature map up-sampling within the network is presented.

Just-in-Time Reconstruction: Inpainting Sparse Maps Using Single View Depth Predictors as Priors

This work adopts a fairly standard approach to data fusion, to produce a fused depth map by performing inference over a novel fully-connected Conditional Random Field (CRF) which is parameterized by the input depth maps and their pixel-wise confidence weights.

Discrete-Continuous Depth Estimation from a Single Image

This paper formulates monocular depth estimation as a discrete-continuous optimization problem, where the continuous variables encode the depth of the superpixels in the input image, and the discrete ones represent relationships between neighboring superpixels.

Learning Depth from Single Monocular Images

This work begins by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps, and applies supervised learning to predict the depthmap as a function of the image.

From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation

This paper proposes a network architecture that uses novel local planar guidance layers at multiple stages of the decoding phase and outperforms state-of-the-art works by a significant margin on challenging benchmarks.

Deep Depth Completion of a Single RGB-D Image

A deep network is trained that takes an RGB image as input and predicts dense surface normals and occlusion boundaries, which are then combined with raw depth observations provided by the RGB-D camera to solve for depths for all pixels, including those missing in the original observation.

Deep Ordinal Regression Network for Monocular Depth Estimation

The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI, Make3D, and NYU Depth v2, and outperforms existing methods by a large margin.

Deep convolutional neural fields for depth estimation from a single image

A deep structured learning scheme is proposed that learns the unary and pairwise potentials of a continuous CRF in a unified deep CNN framework and can be used for depth estimation of general scenes with no geometric priors or any extra information injected.