• Corpus ID: 237605151

Rational Polynomial Camera Model Warping for Deep Learning Based Satellite Multi-View Stereo Matching

  • Jian Gao, Jin Liu, Shunping Ji
  • Published 23 September 2021
  • Engineering, Computer Science
  • ArXiv
Satellite multi-view stereo (MVS) imagery is particularly suited for large-scale Earth surface reconstruction. Unlike the perspective camera model (pin-hole model) commonly used for close-range and aerial cameras, the cubic rational polynomial camera (RPC) model is the mainstream model for push-broom linear-array satellite cameras. However, the homography warping used in the prevailing learning-based MVS methods is only applicable to pin-hole cameras. In order to apply the SOTA…
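For readers unfamiliar with the RPC model named in the abstract, the following is a minimal sketch of its general form: each normalized image coordinate is a ratio of two cubic polynomials (20 terms each) in normalized ground coordinates. The function names, the dictionary keys, and the monomial ordering are assumptions made for illustration; real RPC files fix their own coefficient order, and this is not the warping method proposed in the paper.

```python
from itertools import combinations_with_replacement
from math import prod


def cubic_monomials(x, y, z):
    """Return the 20 monomials of degree <= 3 in (x, y, z).

    The ordering (constant, x, y, z, x*x, x*y, ...) is illustrative only
    and does not follow any particular RPC file convention.
    """
    terms = []
    for degree in range(4):
        for combo in combinations_with_replacement((x, y, z), degree):
            terms.append(prod(combo))  # empty product is 1: the constant term
    return terms


def rational_poly(num, den, x, y, z):
    """Evaluate a ratio of two cubic polynomials with 20 coefficients each."""
    m = cubic_monomials(x, y, z)
    numerator = sum(c * t for c, t in zip(num, m))
    denominator = sum(c * t for c, t in zip(den, m))
    return numerator / denominator


def rpc_project(model, lat, lon, h):
    """Project a ground point to (row, col) with a hypothetical RPC dict."""
    # Normalize ground coordinates with per-axis offset and scale.
    x = (lon - model["lon_off"]) / model["lon_scale"]
    y = (lat - model["lat_off"]) / model["lat_scale"]
    z = (h - model["h_off"]) / model["h_scale"]
    # Ratios of cubic polynomials give normalized image coordinates.
    row_n = rational_poly(model["row_num"], model["row_den"], x, y, z)
    col_n = rational_poly(model["col_num"], model["col_den"], x, y, z)
    # De-normalize to pixel coordinates.
    row = row_n * model["row_scale"] + model["row_off"]
    col = col_n * model["col_scale"] + model["col_off"]
    return row, col
```

Because the mapping is a ratio of cubic polynomials rather than a linear projection, a single 3x3 homography per depth plane cannot reproduce it, which is the limitation the abstract points to.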

A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction From an Open Aerial Dataset
  • Jin Liu, Shunping Ji
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
A novel network, called RED-Net, is proposed for wide-range depth inference. It is built on a recurrent encoder-decoder structure that regularizes cost maps across depths, with a 2D fully convolutional network as the framework. The RED-Net model pre-trained on the synthetic WHU dataset is shown to transfer efficiently to very different multi-view aerial image datasets without any fine-tuning.
MVSNet: Depth Inference for Unstructured Multi-view Stereo
This work presents an end-to-end deep learning architecture for depth map inference from multi-view images that flexibly adapts to arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature.
MVSNet++: Learning Depth-Based Attention Pyramid Features for Multi-View Stereo
This work proposes MVSNet++, an end-to-end trainable network for dense depth estimation, designs three loss functions, and integrates a Curriculum Learning framework into the training process, which can lead to an accurate 3D model reconstruction.
P-MVSNet: Learning Patch-Wise Matching Confidence Aggregation for Multi-View Stereo
This paper proposes P-MVSNet, a new end-to-end deep learning network for multi-view stereo based on isotropic and anisotropic 3D convolutions, which achieves state-of-the-art performance over many existing methods on multi-view stereo.
Deep Stereo Using Adaptive Thin Volume Representation With Uncertainty Awareness
  • Shuo Cheng, Zexiang Xu, +4 authors Hao Su
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
The proposed ATV consists of only a small number of planes with low memory and computation costs; yet, it efficiently partitions local depth ranges within learned small uncertainty intervals, which enables reconstruction with high completeness and accuracy in a coarse-to-fine fashion.
Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference
This paper introduces a scalable multi-view stereo framework based on the recurrent neural network that dramatically reduces memory consumption, makes high-resolution reconstruction feasible, and demonstrates the scalability of the proposed method on several large-scale scenarios.
DeepMVS: Learning Multi-view Stereopsis
The results show that DeepMVS compares favorably against state-of-the-art conventional MVS algorithms and other ConvNet based methods, particularly for near-textureless regions and thin structures.
A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos
This benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images and provides data at significantly higher temporal and spatial resolution.
Semantic Stereo for Incidental Satellite Images
A large-scale public dataset including multi-view, multi-band satellite images and ground truth geometric and semantic labels for two large cities is established, and lightweight public baselines adapted from recent state-of-the-art convolutional neural network models are presented.
Cost Volume Pyramid Based Depth Inference for Multi-View Stereo
It is demonstrated that building a cost volume pyramid in a coarse-to-fine manner, instead of constructing a cost volume at a fixed resolution, leads to a compact, lightweight network and allows inferring high-resolution depth maps to achieve better reconstruction results.
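
The variance-based cost metric mentioned in the MVSNet entry above can be illustrated with a minimal sketch: the features from N views at one location are reduced to a single cost vector by taking the element-wise variance across views, so the result has the same size for any N. The function name and the plain-list representation are assumptions for illustration; the actual network operates on full feature volumes.

```python
def variance_cost(features):
    """Element-wise variance across N per-view feature vectors.

    `features` is a list of N equal-length lists, one per view
    (a hypothetical, simplified stand-in for warped feature volumes).
    """
    n = len(features)
    channels = list(zip(*features))  # regroup values by feature channel
    means = [sum(ch) / n for ch in channels]
    # Mean squared deviation from the per-channel mean over all views.
    return [sum((v - m) ** 2 for v in ch) / n for ch, m in zip(channels, means)]
```

A low variance at a given depth hypothesis indicates that the views agree there, which is why the metric can serve as a matching cost for any number of input views.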