• Corpus ID: 24957590

Deep Depth Inference using Binocular and Monocular Cues

@article{GuoDeepDepth,
  title={Deep Depth Inference using Binocular and Monocular Cues},
  author={Xinqing Guo and Z. Chen and Siyuan Li and Yang Yang and Jingyi Yu},
  journal={arXiv: Computer Vision and Pattern Recognition},
}
The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception. In computer vision, the two problems are traditionally solved in separate tracks. In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference. Specifically, we use a pair of focal stacks as input to emulate human perception. We first construct a comprehensive focal stack training dataset synthesized by depth… 
MonSter: Awakening the Mono in Stereo
This work proposes a two-camera system in which the cameras are used jointly to extract a stereo depth and individually to provide a monocular depth from each camera, leading to more accurate depth estimation and a novel online self-calibration strategy.
Depth Estimation From a Light Field Image Pair With a Generative Model
A novel method to estimate the disparity maps from a light field image pair captured by two light field cameras; it integrates two types of critical depth cues, inferred separately from the epipolar plane images and from binocular stereo vision, into a global solution.
Binocular Light-Field: Imaging Theory and Occlusion-Robust Depth Perception Application
An accurate occlusion-robust depth estimation algorithm is proposed that exploits multi-baseline stereo matching cues and defocus cues to eliminate matching ambiguities and outliers in binocular stereo vision (SV) and light field (LF) imaging.
Deep Light-field-driven Saliency Detection from a Single View
This paper proposes a high-quality light field synthesis network to produce reliable 4D light field information, together with a novel light-field-driven saliency detection network with two purposes: richer saliency features can be produced, and geometric information can be considered for integrating multi-view saliency maps in a view-wise attention fashion.
Memory-oriented Decoder for Light Field Salient Object Detection
A deep-learning-based method in which a novel memory-oriented decoder is tailored for light field saliency detection; it deeply explores and comprehensively exploits the internal correlation of focal slices for accurate prediction by designing feature fusion and integration mechanisms.
LFNet: Light Field Fusion Network for Salient Object Detection
A novel light field fusion network, LFNet, a CNN-based light field saliency model using 4D light field data containing abundant spatial and contextual information, which can reliably locate and identify salient objects even in complex scenes.


Depth from Combining Defocus and Correspondence Using Light-Field Cameras
A simple, novel, and principled algorithm is presented that computes dense depth estimation by combining both defocus and correspondence depth cues, and shows how to combine the two cues into a high-quality depth map suitable for computer vision applications such as matting, full control of depth of field, and surface reconstruction.
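The defocus–correspondence combination described above can be sketched as a per-pixel confidence-weighted fusion of two depth hypotheses. This is a deliberately minimal stand-in for the paper's global optimization; all function and variable names here are illustrative.

```python
import numpy as np

def fuse_depth_cues(depth_defocus, conf_defocus, depth_corresp, conf_corresp):
    """Fuse per-pixel depth estimates from defocus and correspondence cues.

    Each cue supplies a depth map and a confidence map; the fused depth is
    the confidence-weighted average (a toy stand-in for the paper's global
    regularization step).
    """
    w = conf_defocus + conf_corresp + 1e-8  # avoid division by zero
    return (conf_defocus * depth_defocus + conf_corresp * depth_corresp) / w

# Toy example: defocus tends to be confident in textureless regions,
# correspondence where matching is unambiguous.
d_def = np.full((2, 2), 1.0)
d_cor = np.full((2, 2), 3.0)
c_def = np.array([[1.0, 0.0], [0.5, 0.5]])
c_cor = np.array([[0.0, 1.0], [0.5, 0.5]])
fused = fuse_depth_cues(d_def, c_def, d_cor, c_cor)
# fused[0, 0] follows the defocus cue, fused[0, 1] the correspondence cue,
# and the bottom row averages the two.
```

Where one cue has zero confidence, the other dominates; equal confidences give the arithmetic mean.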
Blur and Disparity Are Complementary Cues to Depth
A Deep Visual Correspondence Embedding Model for Stereo Matching Costs
A novel deep visual correspondence embedding model is trained via a convolutional neural network on a large set of stereo images with ground-truth disparities, and the new measure of pixel dissimilarity is shown to outperform traditional matching costs.
Depth from Semi-Calibrated Stereo and Defocus
A novel approach is proposed to better estimate the disparity map of the main camera, using both cameras at once to generate a high-resolution color image with a complete depth map, without sacrificing resolution and with minimal auxiliary hardware.
Efficient Deep Learning for Stereo Matching
This paper proposes a matching network able to produce very accurate results in less than a second of GPU computation; it exploits a product layer that simply computes the inner product between the two representations of a siamese architecture.
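The inner-product idea above can be sketched as a cost volume built from siamese feature maps. The CNN feature extractor is replaced here by hand-built toy features, and all names are illustrative, not the paper's implementation.

```python
import numpy as np

def matching_cost_volume(feat_left, feat_right, max_disp):
    """Cost volume from the inner product of siamese feature maps.

    feat_left, feat_right: (H, W, C) per-pixel feature vectors, assumed to
    come from a shared-weight ("siamese") CNN. The product layer scores a
    left pixel (y, x) against the right pixel (y, x - d) for each candidate
    disparity d; a higher inner product means a better match.
    """
    H, W, C = feat_left.shape
    cost = np.full((H, W, max_disp), -np.inf)  # invalid shifts stay -inf
    for d in range(max_disp):
        cost[:, d:, d] = np.einsum(
            'ywc,ywc->yw', feat_left[:, d:, :], feat_right[:, :W - d, :])
    return cost

# Toy features: each column carries a distinct unit vector, and the right
# view is the left view shifted by a true disparity of 2.
H, W, C = 2, 6, 6
fl = np.zeros((H, W, C))
for x in range(W):
    fl[:, x, x] = 1.0
fr = np.roll(fl, shift=-2, axis=1)
cost = matching_cost_volume(fl, fr, max_disp=3)
disp = cost.argmax(axis=2)  # winner-takes-all disparity
# disp[:, 2:] is all 2 (the true shift); border columns lack valid shifts.
```

The inner product makes the comparison a single dot product per candidate disparity, which is what lets the network run in well under a second on a GPU, instead of evaluating a learned comparison network per pixel pair.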
Fusing Depth from Defocus and Stereo with Coded Apertures
This paper proves the proportional relationship between the defocus blur diameter and the disparity, which makes calibration easy, and demonstrates through simulation and real experiments the strong performance of the novel depth measurement method, which enjoys the advantages of both depth cues.
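The blur–disparity relation exploited above can be illustrated with the thin-lens model: both the blur-circle diameter and the rectified-stereo disparity are affine functions of inverse depth, so one is a linear function of the other. The parameter values below are illustrative, not the paper's calibration.

```python
import numpy as np

def blur_diameter(depth, aperture, focal_len, focus_depth):
    """Thin-lens blur-circle diameter for a point at `depth`.

    `v` is the image distance of the in-focus plane (1/f = 1/u + 1/v).
    All quantities share the same length unit; values are illustrative.
    """
    v = focal_len * focus_depth / (focus_depth - focal_len)
    return aperture * v * abs(1.0 / focus_depth - 1.0 / depth)

def disparity(depth, baseline, focal_len):
    """Ideal rectified-stereo disparity for a point at `depth`."""
    return baseline * focal_len / depth

# Both quantities are affine in 1/depth, so the blur diameter is a linear
# function of disparity -- the relationship that makes calibration easy.
depths = np.array([2.0, 3.0, 5.0, 8.0])  # all beyond the 1.0 focus plane
blur = blur_diameter(depths, aperture=0.02, focal_len=0.05, focus_depth=1.0)
disp = disparity(depths, baseline=0.1, focal_len=0.05)
slope, offset = np.polyfit(disp, blur, 1)  # exact linear fit, zero residual
```

For points all on one side of the focus plane the fit is exact; calibrating the two coefficients once then lets defocus measurements be converted directly into disparity units.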
Globally consistent depth labeling of 4D light fields
We present a novel paradigm for depth reconstruction from 4D light fields in a variational framework. Taking into account the special structure of light field data, we reformulate the…
Patch Based Confidence Prediction for Dense Disparity Map
A novel method to predict the correctness of stereo correspondences, called confidence, together with a confidence fusion method for dense disparity estimation, which is incorporated into Semi-Global Matching (SGM) by adjusting its parameters directly.
Look Wider to Match Image Patches With Convolutional Neural Networks
A novel convolutional neural network module is proposed to learn a stereo matching cost with a large-sized window; it can successfully utilize information from a large area without introducing the fattening effect.
Maximum-likelihood depth-from-defocus for active vision
  • W. Klarquist, W. Geisler, A. Bovik
    Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots
  • 1995
A new method for actively recovering depth information using image defocus is demonstrated and shown to support active stereo vision depth recovery by providing monocular depth estimates to guide the…