A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction From an Open Aerial Dataset

@article{Liu2020ANR,
  title={A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction From an Open Aerial Dataset},
  author={Jin Liu and Shunping Ji},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={6049--6058}
}
  • Jin Liu, Shunping Ji
  • Published 2 March 2020
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
A great deal of research has recently demonstrated that multi-view stereo (MVS) matching can be solved with deep learning methods. However, these efforts focused on close-range objects, and only a very few deep learning-based methods were specifically designed for large-scale 3D urban reconstruction, owing to the lack of multi-view aerial image benchmarks. In this paper, we present a synthetic aerial dataset, called the WHU dataset, that we created for MVS tasks, which, to our knowledge, is…
Rational Polynomial Camera Model Warping for Deep Learning Based Satellite Multi-View Stereo Matching
  • Jian Gao, Jin Liu, Shunping Ji
  • Engineering, Computer Science
    ArXiv
  • 2021
TLDR
Experiments show that the proposed RPC warping module and the SatMVS framework can achieve a superior reconstruction accuracy compared to the pin-hole fitting method and conventional MVS methods.
ResDepth: A Deep Residual Prior For 3D Reconstruction From High-resolution Satellite Images
TLDR
ResDepth, a convolutional neural network that learns an expressive geometric prior from example data, is introduced and it is demonstrated that the prior encoded in the network weights captures meaningful geometric characteristics of urban design, which also generalize across different districts and even from one city to another.
ResDepth: A Deep Prior For 3D Reconstruction From High-resolution Satellite Images
TLDR
This work introduces RESDEPTH, a convolutional neural network that learns such an expressive geometric prior from example data, and demonstrates that the prior encoded in the network weights captures meaningful geometric characteristics of urban design, which also generalize across different districts and even from one city to another.
ORStereo: Occlusion-Aware Recurrent Stereo Matching for 4K-Resolution Images
TLDR
The Occlusion-aware Recurrent binocular Stereo matching network (ORStereo) is presented, which trains only on available low-disparity-range stereo images by formulating the task as residual updates and refinements of an initial prediction.
3D Building Reconstruction from Monocular Remote Sensing Images
3D building reconstruction from monocular remote sensing imagery is an important research problem and an economic solution to large-scale city modeling, compared with reconstruction from LiDAR data
S2Looking: A Satellite Side-Looking Dataset for Building Change Detection
  • Li Shen, Yao Lu, +8 authors Bitao Jiang
  • Computer Science, Engineering
    ArXiv
  • 2021
TLDR
This paper introduces S2Looking, a building change detection dataset that contains large-scale side-looking satellite images captured at varying off-nadir angles, and tests several state-of-the-art methods on both the S2Looking dataset and the (near-nadir) LEVIR-CD+ dataset.
A Decomposition Model for Stereo Matching
TLDR
A decomposition model for stereo matching is presented to solve the problem of excessive growth in computational cost as the resolution increases; it achieves a 10–100× speed increase while obtaining comparable disparity estimation results.
Limited-angle tomographic reconstruction of dense layered objects by dynamical machine learning
TLDR
A recurrent neural network architecture with a novel split-convolutional gated recurrent unit (SC-GRU) as the fundamental building block is devised and it is shown that the dynamic method improves upon previous static approaches with fewer artifacts and better overall reconstruction fidelity.
Dynamical machine learning volumetric reconstruction of objects’ interiors from limited angular views
TLDR
The dynamic method is suitable for a generic interior-volumetric reconstruction under a limited-angle scheme and accurately reconstructs volume interiors under two conditions: weak scattering, when the Radon transform approximation is applicable and the forward operator well defined; and strong scattering, which is nonlinear with respect to the 3D refractive index distribution and includes uncertainty in the forward operators.
Deep neural networks to improve the dynamic range of Zernike phase-contrast wavefront sensing in high-contrast imaging systems
TLDR
This work investigates the use of two different types of machine learning algorithms to extend the dynamic range of the ZWFS, and presents static and dynamic deep learning architectures for single- and multi-wavelength measurements, respectively.

References

Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference
TLDR
This paper introduces a scalable multi-view stereo framework based on a recurrent neural network that dramatically reduces memory consumption, makes high-resolution reconstruction feasible, and demonstrates the scalability of the proposed method on several large-scale scenarios.
MVSNet: Depth Inference for Unstructured Multi-view Stereo
TLDR
This work presents an end-to-end deep learning architecture for depth map inference from multi-view images that flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature.
DeepMVS: Learning Multi-view Stereopsis
TLDR
The results show that DeepMVS compares favorably against state-of-the-art conventional MVS algorithms and other ConvNet based methods, particularly for near-textureless regions and thin structures.
Pyramid Stereo Matching Network
TLDR
PSMNet is a pyramid stereo matching network consisting of two main modules: spatial pyramid pooling and 3D CNN, which takes advantage of the capacity of global context information by aggregating context in different scales and locations to form a cost volume.
Cascade Residual Learning: A Two-Stage Convolutional Neural Network for Stereo Matching
TLDR
A novel cascade CNN architecture composed of two stages that advances the recently proposed DispNet by equipping it with extra up-convolution modules, leading to disparity images with more details, and shows that residual learning provides more effective refinement.
Learning a Multi-View Stereo Machine
TLDR
End-to-end learning allows us to jointly reason about shape priors while conforming to geometric constraints, enabling reconstruction from far fewer images than required by classical approaches as well as completion of unseen surfaces.
GA-Net: Guided Aggregation Net for End-To-End Stereo Matching
TLDR
Two novel neural net layers, aimed at capturing local and whole-image cost dependencies, respectively, are proposed. They can be used to replace the widely used 3D convolutional layer, which is computationally costly and memory-consuming because it has cubic computational and memory complexity.
A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos
TLDR
This benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images and provides data at significantly higher temporal and spatial resolution.
End-to-End Learning of Geometry and Context for Deep Stereo Regression
We propose a novel deep learning architecture for regressing disparity from a rectified pair of stereo images. We leverage knowledge of the problem’s geometry to form a cost volume using deep feature
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
  • N. Mayer, Eddy Ilg, +4 authors T. Brox
  • Computer Science, Mathematics
    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
TLDR
This paper proposes three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks and presents a convolutional network for real-time disparity estimation that provides state-of-the-art results.