Corpus ID: 221970697

Towards General Purpose and Geometry Preserving Single-View Depth Estimation

@article{Romanov2020TowardsGP,
  title={Towards General Purpose and Geometry Preserving Single-View Depth Estimation},
  author={Mikhail Romanov and Nikolay Patatkin and Anna Vorontsova and Anton Konushin},
  journal={ArXiv},
  year={2020},
  volume={abs/2009.12419}
}
Single-view depth estimation plays a crucial role in scene understanding for AR applications and 3D modelling, as it allows the geometry of a scene to be retrieved. However, this is only possible if the inverse depth estimates are unbiased, i.e. they are either absolute or Up-to-Scale (UTS). In recent years, great progress has been made in general-purpose single-view depth estimation. Nevertheless, the latest general-purpose models were trained using ranking or on Up-to-Shift-Scale (UTSS) data. As a…
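The UTS vs. UTSS distinction above can be made concrete: a UTS prediction is matched to ground truth with a single least-squares scale, while UTSS alignment also fits an additive shift, and it is that unknown shift that distorts the recovered 3D geometry. A minimal NumPy sketch (function names are ours, not from the paper):

```python
import numpy as np

def align_uts(pred, gt):
    """Up-to-Scale: fit one scale s minimizing ||s * pred - gt||^2 (closed form)."""
    s = pred @ gt / (pred @ pred)
    return s * pred

def align_utss(pred, gt):
    """Up-to-Shift-Scale: fit scale s and shift t minimizing ||s * pred + t - gt||^2."""
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt, rcond=None)
    return s * pred + t
```

A prediction that is only correct up to shift and scale cannot be fixed by the UTS alignment alone, which is the bias the abstract refers to.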

Citations

Boosting Light-Weight Depth Estimation Via Knowledge Distillation
TLDR
This paper first introduces a compact network that can estimate a depth map in real time, then presents two complementary and necessary strategies to improve the performance of the light-weight network, which outperforms previous light-weight methods in terms of inference accuracy, computational efficiency and generalization.

References

SHOWING 1-10 OF 38 REFERENCES
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer
TLDR
This work proposes a robust training objective that is invariant to changes in depth range and scale, advocate the use of principled multi-objective learning to combine data from different sources, and highlights the importance of pretraining encoders on auxiliary tasks.
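The scale-and-shift-invariant objective mentioned above can be illustrated with a robust normalization: prediction and ground truth are each aligned to zero shift and unit scale (median and mean absolute deviation) before the loss is computed. A simplified sketch of this idea (our paraphrase, not the paper's exact objective):

```python
import numpy as np

def ssi_normalize(d, eps=1e-8):
    """Align a disparity map to zero shift / unit scale with robust statistics."""
    t = np.median(d)             # robust shift estimate
    s = np.mean(np.abs(d - t))   # robust scale estimate
    return (d - t) / (s + eps)

def ssi_loss(pred, gt):
    """Scale-and-shift-invariant L1 loss between normalized disparities."""
    return np.mean(np.abs(ssi_normalize(pred) - ssi_normalize(gt)))
```

Because both inputs are normalized, any prediction of the form `a * gt + b` (with `a > 0`) yields a near-zero loss, which is exactly the invariance that lets heterogeneous datasets be mixed.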
MegaDepth: Learning Single-View Depth Prediction from Internet Photos
  • Zhengqi Li, Noah Snavely
  • Computer Science
    2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
TLDR
This work proposes to use multi-view Internet photo collections, a virtually unlimited data source, to generate training data via modern structure-from-motion and multi-view stereo (MVS) methods, and presents a large depth dataset called MegaDepth based on this idea.
High Quality Monocular Depth Estimation via Transfer Learning
TLDR
A convolutional neural network for computing a high-resolution depth map given a single RGB image with the help of transfer learning, which outperforms state-of-the-art on two datasets and also produces qualitatively better results that capture object boundaries more faithfully.
Lightweight Monocular Depth Estimation Model by Joint End-to-End Filter Pruning
TLDR
A lightweight monocular depth model obtained from a large trained model is proposed by removing the least important features with a novel joint end-to-end filter pruning and it is shown that masking can improve accuracy over the baseline with fewer parameters, even without enforcing compression loss.
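The paper's pruning is learned jointly, end to end; as a simpler stand-in for intuition, filter importance is often scored by the L1 norm of each convolutional filter's weights, with the lowest-scoring filters masked out (a hedged sketch, not the paper's method):

```python
import numpy as np

def prune_mask(weights, keep_ratio):
    """Mask the least-important conv filters by L1 norm of their weights.

    weights: array of shape (out_channels, in_channels, kh, kw);
    returns a boolean keep-mask over out_channels.
    """
    scores = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    k = max(1, int(round(keep_ratio * weights.shape[0])))
    keep = np.zeros(weights.shape[0], dtype=bool)
    keep[np.argsort(scores)[-k:]] = True  # keep the k highest-scoring filters
    return keep
```

The end-to-end variant in the paper instead learns the mask jointly with the task loss, which is what allows accuracy to improve over the unpruned baseline.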
Deep Monocular Depth Estimation via Integration of Global and Local Predictions
TLDR
A deep variational model that effectively integrates heterogeneous predictions from two convolutional neural networks, named global and local networks, which have contrasting network architecture and are designed to capture the depth information with complementary attributes.
Deep Ordinal Regression Network for Monocular Depth Estimation
TLDR
The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI, Make3D, and NYU Depth v2, and outperforms existing methods by a large margin.
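DORN's key ingredient is casting depth regression as ordinal classification over log-spaced bins (spacing-increasing discretization, SID). A small sketch of that discretization (the bin count and depth range here are illustrative):

```python
import numpy as np

def sid_edges(alpha, beta, k):
    """k log-uniformly spaced depth bins over [alpha, beta]: k + 1 edges (SID)."""
    i = np.arange(k + 1)
    return np.exp(np.log(alpha) + np.log(beta / alpha) * i / k)

def depth_to_ordinal(depth, edges):
    """Ordinal label = index of the bin a depth value falls into."""
    return int(np.searchsorted(edges, depth, side="right")) - 1
```

For example, `sid_edges(1.0, 8.0, 3)` gives edges `[1, 2, 4, 8]`: bins widen with distance, so the training signal tolerates larger absolute errors far from the camera.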
Monocular Relative Depth Perception with Web Stereo Data Supervision
TLDR
A simple yet effective method to automatically generate dense relative depth annotations from web stereo images, and an improved ranking loss is introduced to deal with imbalanced ordinal relations, enforcing the network to focus on a set of hard pairs.
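The ranking loss referenced above operates on ordinal pixel pairs rather than metric depth. A common per-pair form (following the standard relative-depth formulation; the paper's improved re-weighting for imbalanced ordinal relations is omitted):

```python
import numpy as np

def pair_ranking_loss(za, zb, r):
    """Ranking loss for one pixel pair with predicted depths za, zb.

    r = +1 if point A should be farther than B, -1 if closer,
    0 if the two depths are (approximately) equal.
    """
    if r == 0:
        return (za - zb) ** 2  # pull equal-depth pairs together
    return float(np.log1p(np.exp(-r * (za - zb))))  # penalize wrong ordering
```

The loss is small when the predicted order matches the annotation and grows the more confidently the network gets the order wrong, which is why such supervision yields only relative (not metric) depth.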
FastDepth: Fast Monocular Depth Estimation on Embedded Systems
TLDR
This paper proposes an efficient and lightweight encoder-decoder network architecture and applies network pruning to further reduce computational complexity and latency and demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle.
CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth
TLDR
A new type of convolution is proposed that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns, and improves the generalization capabilities of depth prediction networks considerably.
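The camera-aware idea can be sketched as extra input channels computed from the intrinsics, e.g. pixel coordinates centered on the principal point and per-pixel field-of-view angles (the exact channel set is our approximation of the paper's CAM-Convs inputs):

```python
import numpy as np

def camera_aware_maps(h, w, fx, fy, cx, cy):
    """Build 4 intrinsics-dependent channels: centered coords and FOV maps."""
    u = np.tile(np.arange(w, dtype=float), (h, 1))           # column index per pixel
    v = np.tile(np.arange(h, dtype=float)[:, None], (1, w))  # row index per pixel
    ccx, ccy = u - cx, v - cy                                # coords centered on principal point
    fovx, fovy = np.arctan(ccx / fx), np.arctan(ccy / fy)    # per-pixel viewing angles
    return np.stack([ccx, ccy, fovx, fovy])                  # shape (4, h, w)
```

Concatenating such maps with intermediate feature maps lets the network condition its predictions on the calibration, which is what improves cross-camera generalization.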
Light-Weight RefineNet for Real-Time Semantic Segmentation
TLDR
This work adapts a powerful semantic segmentation architecture, called RefineNet, into the more compact one, suitable even for tasks requiring real-time performance on high-resolution inputs, and proposes two modifications aimed to decrease the number of parameters and floating point operations.