Deeper Depth Prediction with Fully Convolutional Residual Networks

  • Iro Laina, C. Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab
  • 2016 Fourth International Conference on 3D Vision (3DV)
This paper addresses the problem of estimating the depth map of a scene given a single RGB image. […] For optimization, we introduce the reverse Huber loss that is particularly suited for the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed of a single architecture that is trained end-to-end and does not rely on post-processing techniques, such as CRFs or other additional refinement steps. As a result, it runs in real-time on images or videos…
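
The reverse Huber (berHu) loss named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the form used here is an assumption based on the usual definition of berHu: the loss equals the absolute residual |x| when |x| ≤ c and (x² + c²) / (2c) otherwise. Setting c to 20% of the largest absolute residual, averaging over elements, and the names `berhu_loss` and `c_ratio` are likewise illustrative assumptions, not details confirmed by this page.

```python
def berhu_loss(pred, target, c_ratio=0.2):
    """Mean reverse Huber (berHu) loss over two equal-length sequences.

    Assumed form: |x| for |x| <= c, (x^2 + c^2) / (2c) otherwise,
    with c = c_ratio * max|x| (c_ratio = 0.2 is an assumption).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    residuals = [abs(p - t) for p, t in zip(pred, target)]
    if not residuals:
        return 0.0

    # Threshold separating the linear (L1) and quadratic (L2) regimes.
    c = c_ratio * max(residuals)
    if c == 0.0:
        return 0.0  # all residuals are zero, so the loss is zero

    total = 0.0
    for r in residuals:
        if r <= c:
            total += r  # small residuals: L1 behaviour
        else:
            total += (r * r + c * c) / (2.0 * c)  # large residuals: scaled L2
    return total / len(residuals)


# Hypothetical depths in metres, for illustration only.
predicted = [1.0, 2.0, 3.0]
ground_truth = [1.0, 2.5, 5.0]
loss = berhu_loss(predicted, ground_truth)
```

The two branches meet at |x| = c, where (c² + c²) / (2c) = c, so the loss is continuous. Small residuals are penalised linearly, while large ones grow quadratically.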


Depth Estimation From a Single Image Using Guided Deep Network
A novel and simple method is proposed by exploiting the latent space of the depth-to-depth network, which contains useful encoded features for guiding the process of depth generation, to greatly enhance local details even under complicated background regions.
An Adaptive Unsupervised Learning Framework for Monocular Depth Estimation
An adaptive loss function to tackle the regions which are non-overlapping between consecutive images is presented and a novel depth smoothness loss is proposed to improve the accuracy of the model.
Depth Estimation from Single Image Using CNN-Residual Network
This project performs transfer learning by replacing the fully connected layer of ResNet-50 with upsampling blocks to recover the size of the depth map, and demonstrates that up-sampling with a CNN-Residual network yields better results than a fully connected layer because it avoids overfitting.
Deep Monocular Depth Estimation via Integration of Global and Local Predictions
A deep variational model that effectively integrates heterogeneous predictions from two convolutional neural networks, named global and local networks, which have contrasting network architecture and are designed to capture the depth information with complementary attributes.
Depth Completion with Morphological Operations: An Intermediate Approach to Enhance Monocular Depth Estimation
  • R. Q. Mendes, E. G. Ribeiro, N. Rosa, V. Grassi
  • Computer Science
    2020 Latin American Robotics Symposium (LARS), 2020 Brazilian Symposium on Robotics (SBR) and 2020 Workshop on Robotics in Education (WRE)
  • 2020
This work addresses the SIDE and depth completion tasks jointly, focusing on the design of a lightweight method to be applied in real self-driving scenarios, and introduces a fast and efficient densification algorithm, based on closing morphology, and a deep network pipeline that uses the densified reference depth maps for training.
Depth Completion via Deep Basis Fitting
The proposed method replaces the final 1 × 1 convolutional layer employed in most depth completion networks with a least squares fitting module which computes weights by fitting the implicit depth bases to the given sparse depth measurements.
High Quality Monocular Depth Estimation via Transfer Learning
A convolutional neural network for computing a high-resolution depth map given a single RGB image with the help of transfer learning, which outperforms state-of-the-art on two datasets and also produces qualitatively better results that capture object boundaries more faithfully.
Smaller Residual Network for Single Image Depth Estimation
A new framework for estimating depth information from a single image by employing a two-stage architecture: a residual network and a simple decoder network, with a loss that is also computed from gradient direction and structural similarity.
Estimating Depth From Monocular Images as Classification Using Deep Fully Convolutional Residual Networks
By performing depth classification instead of regression, this paper can easily obtain the confidence of a depth prediction in the form of probability distribution and apply an information gain loss to make use of the predictions that are close to ground-truth during training, as well as fully-connected conditional random fields for post-processing to further improve the performance.


Deep convolutional neural fields for depth estimation from a single image
A deep structured learning scheme which learns the unary and pairwise potentials of continuous CRF in a unified deep CNN framework and can be used for depth estimations of general scenes with no geometric priors nor any extra information injected.
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
This paper employs two deep network stacks: one that makes a coarse global prediction based on the entire image, and another that refines this prediction locally, and applies a scale-invariant error to help measure depth relations rather than scale.
Learning Depth from Single Monocular Images
This work begins by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps, and applies supervised learning to predict the depthmap as a function of the image.
Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs
This paper tackles this challenging and essentially underdetermined problem by regression on deep convolutional neural network (DCNN) features, combined with a post-processing refining step using conditional random fields (CRF).
Monocular Depth Estimation Using Neural Regression Forest
  • Anirban Roy, S. Todorovic
  • Computer Science
    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
This paper presents a novel deep architecture, called neural regression forest (NRF), for depth estimation from a single image. NRF combines random forests and convolutional neural networks (CNNs).
Very Deep Convolutional Networks for Large-Scale Image Recognition
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Towards unified depth and semantic prediction from a single image
This work proposes a unified framework for joint depth and semantic prediction that effectively leverages the advantages of both tasks and achieves state-of-the-art results.
Discrete-Continuous Depth Estimation from a Single Image
This paper formulates monocular depth estimation as a discrete-continuous optimization problem, where the continuous variables encode the depth of the superpixels in the input image, and the discrete ones represent relationships between neighboring superpixels.
Robust Optimization for Deep Regression
A regression model with ConvNets is proposed that achieves robustness to outlier levels by minimizing Tukey's biweight function, an M-estimator robust to outliers, as the loss function for the ConvNet.
Single image depth estimation from predicted semantic labels
This work first performs a semantic segmentation of the scene and uses the semantic labels to guide the 3D reconstruction and incorporates semantic features to achieve state-of-the-art results with a significantly simpler model than previous works.