
Semi-Supervised Disparity Estimation with Deep Feature Reconstruction

Julia Guerrero-Viu, Sergio Izquierdo, Philipp Schröppel, Thomas Brox
Despite the success of deep learning in disparity estimation, the domain generalization gap remains an issue. We propose a semi-supervised pipeline that successfully adapts DispNet to a real-world domain by joint supervised training on labeled synthetic data and self-supervised training on unlabeled real data. Furthermore, accounting for the limitations of the widely-used photometric loss, we analyze the impact of deep feature reconstruction as a promising supervisory signal for disparity… 
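The pipeline described above combines a supervised term on labeled synthetic data with a self-supervised reconstruction term on unlabeled real data. A minimal NumPy sketch of that idea follows; the nearest-neighbour warping, the function names, and the weighting `w_self` are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def warp_by_disparity(right_feat, disparity):
    """Sample right-view features at x - d to reconstruct the left view.
    right_feat: (H, W, C) feature map, disparity: (H, W) left-view disparities."""
    H, W, _ = right_feat.shape
    xs = np.arange(W)[None, :] - disparity            # source columns in the right view
    xs = np.clip(np.round(xs).astype(int), 0, W - 1)  # nearest-neighbour, clamped
    rows = np.arange(H)[:, None]
    return right_feat[rows, xs]                       # (H, W, C) reconstruction

def semi_supervised_loss(pred_disp, gt_disp, labeled_mask,
                         left_feat, right_feat, w_self=0.5):
    """Supervised L1 where ground truth exists, feature reconstruction elsewhere."""
    sup = np.abs(pred_disp - gt_disp)[labeled_mask].mean() if labeled_mask.any() else 0.0
    recon = warp_by_disparity(right_feat, pred_disp)
    self_sup = np.abs(left_feat - recon).mean()       # feature-metric reconstruction error
    return sup + w_self * self_sup
```

Comparing deep features rather than raw pixels is the "deep feature reconstruction" ingredient: it is less sensitive to illumination changes and textureless regions than a purely photometric loss.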

References

End-to-End Learning of Geometry and Context for Deep Stereo Regression
We propose a novel deep learning architecture for regressing disparity from a rectified pair of stereo images. We leverage knowledge of the problem's geometry to form a cost volume using deep feature representations.
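The cost-volume construction mentioned in this summary can be sketched as follows; this is a minimal NumPy illustration under the common concatenation-over-disparities layout, not the paper's actual implementation:

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenate left features with right features shifted by each candidate
    disparity, giving a (max_disp, H, W, 2C) volume for a 3D matching network."""
    H, W, C = left_feat.shape
    volume = np.zeros((max_disp, H, W, 2 * C), dtype=left_feat.dtype)
    for d in range(max_disp):
        shifted = np.zeros_like(right_feat)
        if d == 0:
            shifted[:] = right_feat
        else:
            shifted[:, d:] = right_feat[:, :-d]   # right view shifted right by d pixels
        volume[d] = np.concatenate([left_feat, shifted], axis=-1)
    return volume
```

A downstream network can then aggregate this volume with 3D convolutions and regress sub-pixel disparities from it.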
Digging Into Self-Supervised Monocular Depth Estimation
It is shown that a surprisingly simple model, together with its associated design choices, leads to superior predictions, producing depth maps that are both quantitatively and qualitatively better than those of competing self-supervised methods.
DispSegNet: Leveraging Semantics for End-to-End Learning of Disparity Estimation From Stereo Imagery
Designs a CNN architecture that combines disparity estimation and semantic segmentation, so that a single network outputs both disparity estimates and semantic labels, improving the quality and accuracy of the disparity estimates.
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
The use of stereo sequences for learning depth and visual odometry enables the use of both spatial and temporal photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale.
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
  • N. Mayer, E. Ilg, T. Brox, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
This paper proposes three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks and presents a convolutional network for real-time disparity estimation that provides state-of-the-art results.
Feature-metric Loss for Self-supervised Learning of Depth and Egomotion
A feature-metric loss defined on feature representations, where the features are themselves learned in a self-supervised manner and regularized by both first-order and second-order derivatives, shaping the loss landscape into proper convergence basins.
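The derivative-based regularization in this summary can be sketched as follows; a hedged NumPy illustration of the general idea (reward large first-order feature gradients, penalize high curvature), with signs and naming as assumptions rather than the paper's exact formulation:

```python
import numpy as np

def feature_regularizers(feat):
    """First- and second-order derivative terms on a (H, W) feature channel:
    a discriminative term that rewards large gradients, and a convergence
    term that penalizes high curvature for a smoother loss landscape."""
    dx = np.diff(feat, axis=1)        # first-order derivative along width
    dxx = np.diff(feat, n=2, axis=1)  # second-order derivative along width
    l_dis = -np.mean(np.abs(dx))      # more negative = more discriminative features
    l_cvt = np.mean(np.abs(dxx))      # small when features vary smoothly
    return l_dis, l_cvt
```

For a linear feature ramp the second-order term vanishes while the first-order term is maximally informative, which is the behaviour such regularizers aim for.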
Unsupervised Learning of Stereo Matching
This paper presents a framework for learning stereo matching costs without human supervision by updating network parameters iteratively, and it performs comparably to supervised methods.
Unsupervised Learning of Optical Flow with Deep Feature Similarity
The proposed method uses a polarizing scheme that produces a more discriminative similarity map, effectively improving on state-of-the-art techniques.
Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation
A novel self-supervised paradigm that reverses the link between monocular and stereo artefacts in order to train deep stereo networks, achieving notable generalization under domain shift.
High-Resolution Stereo Datasets with Subpixel-Accurate Ground Truth
We present a structured lighting system for creating high-resolution stereo datasets of static indoor scenes with highly accurate ground-truth disparities.