Corpus ID: 246035934

Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth

@article{Kim2022GlobalLocalPN,
  title={Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth},
  author={Doyeon Kim and Woonghyun Ga and Pyungwhan Ahn and Donggyu Joo and Sehwan Chun and Junmo Kim},
  journal={ArXiv},
  year={2022},
  volume={abs/2201.07436}
}
Depth estimation from a single image is an important task that can be applied to various fields in computer vision, and has grown rapidly with the development of convolutional neural networks. In this paper, we propose a novel structure and training strategy for monocular depth estimation to further improve the prediction accuracy of the network. We deploy a hierarchical transformer encoder to capture and convey the global context, and design a lightweight yet powerful decoder to generate an… 

GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation

TLDR
The GFI-Net is proposed, which aims to utilize geometric features, such as object locations and vanishing points, on a global scale and improves the performance of the depth estimation by mitigating information reduction and amplifying global interaction representations.

Multilevel Pyramid Network for Monocular Depth Estimation Based on Feature Refinement and Adaptive Fusion

TLDR
The proposed multilevel pyramid network for monocular depth estimation based on feature refinement and adaptive fusion can recover reasonable depth outputs with better details and outperform several depth recovery algorithms from a qualitative and quantitative perspective.

LiteDepth: Digging into Fast and Accurate Depth Estimation on Mobile Devices

TLDR
This paper develops an end-to-end learning-based model with a tiny weight size and a short inference time, and proposes a simple yet effective data augmentation strategy, called R² crop, to boost the model performance.

Revealing the Dark Secrets of Masked Image Modeling

TLDR
This paper compares MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences, and finds that MIM models can perform significantly better than their supervised counterparts on geometric and motion tasks with weak semantics or on fine-grained classification tasks.

CARLA-GeAR: a Dataset Generator for a Systematic Evaluation of Adversarial Robustness of Vision Models

TLDR
CARLA-GeAR is presented, a tool for the automatic generation of photo-realistic synthetic datasets that can be used for a systematic evaluation of the adversarial robustness of neural models against physical adversarial patches, as well as for comparing the performance of different adversarial defense/detection methods.

Image-Based Obstacle Detection Methods for the Safe Navigation of Unmanned Vehicles: A Review

TLDR
It is observed that despite significant progress, deep learning techniques also face difficulties in complex and unknown environments where objects of varying types and shapes are present.

LocalBins: Improving Depth Estimation by Learning Local Distributions

TLDR
A novel architecture for depth estimation from a single image based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks and evolves in two ways that predict depth distributions of local neighborhoods at every pixel.

References

SHOWING 1-10 OF 28 REFERENCES

From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation

TLDR
This paper proposes a network architecture that utilizes novel local planar guidance layers located at multiple stages in the decoding phase and outperforms state-of-the-art works by a significant margin when evaluated on challenging benchmarks.

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation

TLDR
A Structure-Aware Residual Pyramid Network (SARPN) to exploit multi-scale structures for accurate depth prediction and an Adaptive Dense Feature Fusion (ADFF) module, which adaptively fuses effective features from all scales for inferring structures of each scale, is introduced.

Deep Ordinal Regression Network for Monocular Depth Estimation

TLDR
The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI, Make3D, and NYU Depth v2, and outperforms existing methods by a large margin.

Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue

TLDR
This work proposes an unsupervised framework to learn a deep convolutional neural network for single view depth prediction, without requiring a pre-training stage or annotated ground-truth depths, and shows that this network trained on less than half of the KITTI dataset gives comparable performance to that of the state-of-the-art supervised methods for single view depth estimation.

AdaBins: Depth Estimation Using Adaptive Bins

TLDR
A transformer-based architecture block that divides the depth range into bins whose center value is estimated adaptively per image, and which shows a decisive improvement over the state-of-the-art on several popular depth datasets across all metrics.
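The adaptive-bin idea in the summary above can be illustrated with a minimal sketch. It assumes the network outputs one set of non-negative bin widths per image and a per-pixel probability over bins; the function names, the depth range, and the sample values are hypothetical and are not taken from the authors' code.

```python
def bin_centers(widths, d_min, d_max):
    """Turn per-image bin widths into bin-center depths.

    The widths are normalised to sum to 1, laid end to end across
    [d_min, d_max], and each center is the midpoint of its bin.
    """
    total = sum(widths)
    if total <= 0:
        raise ValueError("bin widths must have a positive sum")
    centers = []
    start = 0.0  # normalised left edge of the current bin
    for w in widths:
        w_norm = w / total
        centers.append(d_min + (d_max - d_min) * (start + w_norm / 2))
        start += w_norm
    return centers


def pixel_depth(probs, centers):
    """Depth of one pixel as the probability-weighted sum of bin centers."""
    return sum(p * c for p, c in zip(probs, centers))


# Hypothetical values: three bins over a 0-10 m range.
centers = bin_centers([0.25, 0.25, 0.5], d_min=0.0, d_max=10.0)
depth = pixel_depth([0.5, 0.5, 0.0], centers)
```

Because the widths are predicted per image, the same number of bins can concentrate on the near range in an indoor scene and spread out in an outdoor one.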

ACED: Accurate And Edge-Consistent Monocular Depth Estimation

TLDR
For the first time, a fully differentiable ordinal regression is formulated and the network is trained in an end-to-end fashion, leading to smooth and edge-consistent depth maps in single image depth estimation.

Enforcing Geometric Constraints of Virtual Normal for Depth Prediction

TLDR
This work shows the importance of the high-order 3D geometric constraints for depth prediction by designing a loss term that enforces one simple type of geometric constraints, namely, virtual normal directions determined by randomly sampled three points in the reconstructed 3D space, to considerably improve the depth prediction accuracy.
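A rough sketch of the virtual-normal constraint described above: sample three points from the reconstructed 3D point cloud, take the normal of the plane through them, and compare the normals obtained from predicted and ground-truth points. The L1 comparison, the handling of collinear triplets, and all names below are assumptions made for illustration; the summary does not specify them.

```python
import math
import random


def virtual_normal(p0, p1, p2):
    """Unit normal of the plane through three 3D points (x, y, z)."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("points are collinear, normal is undefined")
    return tuple(c / length for c in n)


def virtual_normal_loss(pred_points, gt_points, num_triplets=100, seed=0):
    """Mean L1 difference between predicted and ground-truth virtual normals.

    Assumption: triplets whose normal is undefined are simply skipped.
    """
    rng = random.Random(seed)
    total, used = 0.0, 0
    for _ in range(num_triplets):
        i, j, k = rng.sample(range(len(gt_points)), 3)
        try:
            n_pred = virtual_normal(pred_points[i], pred_points[j], pred_points[k])
            n_gt = virtual_normal(gt_points[i], gt_points[j], gt_points[k])
        except ValueError:
            continue
        total += sum(abs(a - b) for a, b in zip(n_pred, n_gt))
        used += 1
    return total / used if used else 0.0
```

Because each triplet can span distant pixels, the term penalises large-scale shape errors that a per-pixel depth loss would not see directly.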

Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy

TLDR
CutBlur is proposed that cuts a low-resolution patch and pastes it to the corresponding high-resolution image region and vice versa and consistently and significantly improves the performance across various scenarios, especially when the model size is big and the data is collected under real-world environments.
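The cut-and-paste step in the summary above can be sketched as follows. This is a minimal illustration that assumes the low-resolution image has already been upsampled to the high-resolution size and that images are plain 2D lists; the function name `cutblur`, its parameters, and the fixed patch location are hypothetical.

```python
def cutblur(hr, lr_up, top, left, height, width, lr_into_hr=True):
    """Paste a rectangular patch between two same-sized images.

    hr         -- high-resolution image as a list of rows
    lr_up      -- low-resolution image upsampled to the size of hr (assumed)
    lr_into_hr -- True: paste the LR patch into HR; False: the reverse
    Returns a new image; the inputs are not modified.
    """
    if len(hr) != len(lr_up) or any(len(a) != len(b) for a, b in zip(hr, lr_up)):
        raise ValueError("images must have the same size")
    if top < 0 or left < 0 or top + height > len(hr) or left + width > len(hr[0]):
        raise ValueError("patch must lie inside the image")
    base, source = (hr, lr_up) if lr_into_hr else (lr_up, hr)
    out = [row[:] for row in base]  # copy so the inputs stay untouched
    for r in range(top, top + height):
        out[r][left:left + width] = source[r][left:left + width]
    return out


# Toy example: 1 marks HR pixels, 0 marks (upsampled) LR pixels.
hr = [[1] * 4 for _ in range(4)]
lr_up = [[0] * 4 for _ in range(4)]
mixed = cutblur(hr, lr_up, top=1, left=1, height=2, width=2)
```

In training, the patch position and size would typically be drawn at random for each sample; they are fixed here to keep the example deterministic.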

SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation

TLDR
SharpNet is introduced, a method that predicts an accurate depth map given a single input color image, with particular attention to the reconstruction of occluding contours, which is actually better than the "ground truth" acquired by a depth camera based on structured light.

CutDepth: Edge-aware Data Augmentation in Depth Estimation

TLDR
Experiments objectively and subjectively show that the proposed method, called CutDepth, outperforms conventional methods of data augmentation in monocular depth estimation, and the estimation accuracy is improved even though there are few training data at long distances.