• Corpus ID: 246035934

Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth

@article{Kim2022GlobalLocalPN,
  title={Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth},
  author={Doyeon Kim and Woon-Seng Ga and Pyungwhan Ahn and Donggyu Joo and Se Young Chun and Junmo Kim},
  journal={ArXiv},
  year={2022},
  volume={abs/2201.07436}
}
Depth estimation from a single image is an important task that can be applied to various fields in computer vision, and has grown rapidly with the development of convolutional neural networks. In this paper, we propose a novel structure and training strategy for monocular depth estimation to further improve the prediction accuracy of the network. We deploy a hierarchical transformer encoder to capture and convey the global context, and design a lightweight yet powerful decoder to generate an… 

GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation

The GFI-Net is proposed, which aims to utilize geometric features, such as object locations and vanishing points, on a global scale and improves the performance of the depth estimation by mitigating information reduction and amplifying global interaction representations.

Multilevel Pyramid Network for Monocular Depth Estimation Based on Feature Refinement and Adaptive Fusion

The proposed multilevel pyramid network for monocular depth estimation based on feature refinement and adaptive fusion can recover reasonable depth outputs with better details and outperform several depth recovery algorithms from a qualitative and quantitative perspective.

LocalBins: Improving Depth Estimation by Learning Local Distributions

This work proposes a novel architecture for depth estimation from a single image based on the popular encoder-decoder architecture that is frequently used as a starting point for all dense regression tasks and evolves the architecture in two ways.

LightDepth: A Resource Efficient Depth Estimation Approach for Dealing with Ground Truth Sparsity via Curriculum Learning

This work presents a “fast” and “battery-efficient” approach for depth estimation that devises model-agnostic curriculum-based learning for depth estimation and shows that the model's accuracy is on par with state-of-the-art models.

LiteDepth: Digging into Fast and Accurate Depth Estimation on Mobile Devices

This paper develops an end-to-end learning-based model with a tiny weight size and a short inference time, and proposes a simple yet effective data augmentation strategy, called R² crop, to boost the model performance.

Monocular Fisheye Depth Estimation for Automated Valet Parking: Dataset, Baseline and Deep Optimizers

  • Zizhang Wu, Zhi-Gang Fan, Zhengbo Luo
  • Environmental Science, Computer Science
    2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)
  • 2022
This work constructs a system for monocular fisheye depth estimation using an enhanced deep optimizer to improve supervised monocular depth estimation on fisheye camera images, evaluates the optimizer-based method on the KITTI dataset, and achieves state-of-the-art results.

SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments

This work derives a new cross-season scaleless monocular depth prediction dataset, SeasonDepth, from the CMU Visual Localization dataset through structure from motion and formulates several metrics to benchmark performance under different environments using recent state-of-the-art open-source depth prediction pretrained models from the KITTI benchmark.

Revealing the Dark Secrets of Masked Image Modeling

This paper compares MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences and finds that MIM models can perform significantly better than their supervised counterparts on geometric and motion tasks with weak semantics or on fine-grained classification tasks.

CARLA-GeAR: a Dataset Generator for a Systematic Evaluation of Adversarial Robustness of Vision Models

CARLA-GeAR is presented, a tool for the automatic generation of photo-realistic synthetic datasets that can be used for a systematic evaluation of the adversarial robustness of neural models against physical adversarial patches, as well as for comparing the performance of different adversarial defense/detection methods.

Image-Based Obstacle Detection Methods for the Safe Navigation of Unmanned Vehicles: A Review

It is observed that despite significant progress, deep learning techniques also face difficulties in complex and unknown environments where objects of varying types and shapes are present.

References


From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation

This paper proposes a network architecture that utilizes novel local planar guidance layers located at multiple stages in the decoding phase and outperforms state-of-the-art works by a significant margin when evaluated on challenging benchmarks.

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation

A Structure-Aware Residual Pyramid Network (SARPN) to exploit multi-scale structures for accurate depth prediction and an Adaptive Dense Feature Fusion (ADFF) module, which adaptively fuses effective features from all scales for inferring structures of each scale, is introduced.

Leveraging Contextual Information for Monocular Depth Estimation

A novel network architecture is proposed to improve the performance by leveraging the contextual information for monocular depth estimation by introducing a depth prediction network with the proposed attentive skip connection and a global context module.

Deep Ordinal Regression Network for Monocular Depth Estimation

The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI, Make3D, and NYU Depth v2, and outperforms existing methods by a large margin.

Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue

This work proposes an unsupervised framework to learn a deep convolutional neural network for single view depth prediction, without requiring a pre-training stage or annotated ground-truth depths, and shows that this network trained on less than half of the KITTI dataset gives comparable performance to that of the state-of-the-art supervised methods for single view depth estimation.

ACED: Accurate And Edge-Consistent Monocular Depth Estimation

For the first time, a fully differentiable ordinal regression is formulated and the network is trained in an end-to-end fashion, leading to smooth and edge-consistent depth maps in single image depth estimation.

AdaBins: Depth Estimation Using Adaptive Bins

A transformer-based architecture block that divides the depth range into bins whose center value is estimated adaptively per image, and which shows a decisive improvement over the state-of-the-art on several popular depth datasets across all metrics.

Enforcing Geometric Constraints of Virtual Normal for Depth Prediction

This work shows the importance of the high-order 3D geometric constraints for depth prediction by designing a loss term that enforces one simple type of geometric constraints, namely, virtual normal directions determined by randomly sampled three points in the reconstructed 3D space, to considerably improve the depth prediction accuracy.

SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation

SharpNet is introduced, a method that predicts an accurate depth map given a single input color image, with a particular attention to the reconstruction of occluding contours, which is actually better than the "ground truth" acquired by a depth camera based on structured light.

Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy

CutBlur is proposed that cuts a low-resolution patch and pastes it to the corresponding high-resolution image region and vice versa and consistently and significantly improves the performance across various scenarios, especially when the model size is big and the data is collected under real-world environments.
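
The cut-and-paste idea in the CutBlur summary above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cutblur`, the `ratio` parameter, and the use of nested lists as images are hypothetical, and the sketch assumes the low-resolution image has already been upsampled to the high-resolution size so that "corresponding region" means the same pixel coordinates.

```python
import random


def cutblur(hr, lr_up, ratio=0.5, rng=None):
    """Swap one rectangular region between two same-sized 2D images.

    hr     : high-resolution image as a list of rows (H x W).
    lr_up  : low-resolution image, assumed already upsampled to H x W.
    ratio  : side length of the patch relative to the image (assumption).
    rng    : optional random.Random instance for reproducibility.
    Returns the augmented image and the patch box (top, left, height, width).
    """
    rng = rng or random.Random()
    h, w = len(hr), len(hr[0])
    ph, pw = max(1, int(h * ratio)), max(1, int(w * ratio))

    # Pick the patch location uniformly among positions that fit in the image.
    top = rng.randint(0, h - ph)
    left = rng.randint(0, w - pw)

    # Choose the direction: LR patch into HR image, or HR patch into LR image.
    if rng.random() < 0.5:
        base, patch_src = hr, lr_up
    else:
        base, patch_src = lr_up, hr

    # Copy the base image so the inputs are left untouched, then paste.
    out = [row[:] for row in base]
    for y in range(top, top + ph):
        out[y][left:left + pw] = patch_src[y][left:left + pw]
    return out, (top, left, ph, pw)
```

Calling `cutblur(hr, lr_up, ratio=0.5, rng=random.Random(0))` on two 4×4 images returns a copy of one image in which a 2×2 region has been replaced by the same region of the other.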