Accurate and Efficient Stereo Matching via Attention Concatenation Volume
@article{Xu2022AccurateAE,
  title={Accurate and Efficient Stereo Matching via Attention Concatenation Volume},
  author={Gangwei Xu and Yun Wang and Junda Cheng and Jinhui Tang and Xin Yang},
  journal={ArXiv},
  year={2022},
  volume={abs/2209.12699}
}
Stereo matching is a fundamental building block for many vision and robotics applications. An informative and concise cost volume representation is vital for stereo matching of high accuracy and efficiency. In this paper, we present a novel cost volume construction method, named attention concatenation volume (ACV), which generates attention weights from correlation clues to suppress redundant information and enhance matching-related information in the concatenation volume. The ACV can be…
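The idea described in the abstract, weighting a concatenation volume by attention derived from correlation, can be illustrated with a minimal sketch on a single image row. Everything here is an assumption made for illustration: the function names are hypothetical, features are plain Python lists, and a softmax over the disparity axis stands in for the learned attention module, which the abstract does not detail. It is not the authors' implementation.

```python
import math

def correlation(f_left, f_right, max_disp):
    """corr[d][x] = mean over channels of f_left[x] * f_right[x - d]; 0.0 where x - d < 0."""
    width, channels = len(f_left), len(f_left[0])
    corr = [[0.0] * width for _ in range(max_disp)]
    for d in range(max_disp):
        for x in range(d, width):
            corr[d][x] = sum(a * b for a, b in zip(f_left[x], f_right[x - d])) / channels
    return corr

def attention_weights(corr):
    """Per-pixel softmax over the valid disparities.

    Assumption: this plain softmax replaces the learned attention of the paper.
    """
    max_disp, width = len(corr), len(corr[0])
    weights = [[0.0] * width for _ in range(max_disp)]
    for x in range(width):
        valid = range(min(max_disp, x + 1))  # disparities with x - d >= 0
        peak = max(corr[d][x] for d in valid)
        exps = [math.exp(corr[d][x] - peak) for d in valid]
        total = sum(exps)
        for d in valid:
            weights[d][x] = exps[d] / total
    return weights

def concatenation_volume(f_left, f_right, max_disp):
    """concat[d][x] = left features at x followed by right features at x - d."""
    width, channels = len(f_left), len(f_left[0])
    volume = [[[0.0] * (2 * channels) for _ in range(width)] for _ in range(max_disp)]
    for d in range(max_disp):
        for x in range(d, width):
            volume[d][x] = list(f_left[x]) + list(f_right[x - d])
    return volume

def attention_concatenation_volume(f_left, f_right, max_disp):
    """Scale every concatenated feature vector by its attention weight."""
    att = attention_weights(correlation(f_left, f_right, max_disp))
    concat = concatenation_volume(f_left, f_right, max_disp)
    return [[[att[d][x] * v for v in concat[d][x]]
             for x in range(len(f_left))]
            for d in range(max_disp)]

# Toy row: the left features are the right features shifted by one pixel.
f_right = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
f_left = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
acv = attention_concatenation_volume(f_left, f_right, max_disp=3)
```

In this toy row the attention for pixels 1 to 3 is largest at disparity 1, so the concatenated features of the matching hypothesis are kept at a larger scale than those of the competing disparities.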
One Citation
CGI-Stereo: Accurate and Real-Time Stereo Matching via Context and Geometry Interaction
- Computer Science
- ArXiv
- 2023
The core of CGI-Stereo is a Context and Geometry Fusion (CGF) block, which adaptively fuses context and geometry information for more accurate and efficient cost aggregation and also provides feedback to feature learning to guide more effective contextual feature extraction.
References
HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
- Computer Science
- 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2021
HITNet is a novel neural network architecture for real-time stereo matching that not only reasons geometrically about disparities but also infers slanted plane hypotheses, allowing it to perform geometric warping and upsampling operations more accurately.
Group-Wise Correlation Stereo Network
- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
Group-wise correlation provides efficient representations for measuring feature similarities without losing as much information as full correlation, and it preserves better performance than previous methods when the number of parameters is reduced.
Pyramid Stereo Matching Network
- Computer Science
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
PSMNet is a pyramid stereo matching network consisting of two main modules, spatial pyramid pooling and a 3D CNN; the spatial pyramid pooling module exploits global context information by aggregating context at different scales and locations to form a cost volume.
A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos
- Computer Science, Environmental Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images, and it provides data at significantly higher temporal and spatial resolution.
End-to-End Learning of Geometry and Context for Deep Stereo Regression
- Computer Science
- 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
We propose a novel deep learning architecture for regressing disparity from a rectified pair of stereo images. We leverage knowledge of the problem’s geometry to form a cost volume using deep feature…
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
- Computer Science
- 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This paper proposes three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks and presents a convolutional network for real-time disparity estimation that provides state-of-the-art results.
Are we ready for autonomous driving? The KITTI vision benchmark suite
- Computer Science
- 2012 IEEE Conference on Computer Vision and Pattern Recognition
- 2012
The autonomous driving platform is used to develop novel, challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when moved outside the laboratory to the real world.
Bilateral Grid Learning for Stereo Matching Networks
- Computer Science
- 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2021
A novel edge-preserving cost volume upsampling module based on the slicing operation in the learned bilateral grid that outperforms existing published real-time deep stereo matching networks, as well as some complex networks on the KITTI stereo datasets.
Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This paper proposes a cost volume formulation that is both memory- and time-efficient and complementary to existing multi-view stereo and stereo matching approaches based on 3D cost volumes, and applies the cascade cost volume to the representative MVS-Net, obtaining a 35.6% improvement on the DTU benchmark.
DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch
- Computer Science
- 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
A differentiable PatchMatch module is developed that allows us to discard most disparities without requiring full cost volume evaluation, so that the cost volume can be computed efficiently for high-likelihood hypotheses, saving both memory and computation.