Similarity-Aware Fusion Network for 3D Semantic Segmentation
@inproceedings{Zhao2021SimilarityAwareFN,
  title     = {Similarity-Aware Fusion Network for 3D Semantic Segmentation},
  author    = {Linqing Zhao and Jiwen Lu and Jie Zhou},
  booktitle = {2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year      = {2021},
  pages     = {1585-1592}
}
In this paper, we propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation. Existing fusion-based methods achieve superior performance by integrating information from multiple modalities. However, they rely heavily on the projection-based correspondence between 2D pixels and 3D points and can only perform information fusion in a fixed manner, so their performance cannot be easily migrated to a more realistic…
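The projection-based 2D–3D correspondence the abstract refers to is typically a pinhole-camera projection of camera-frame 3D points onto the image plane. A minimal sketch follows; the intrinsic matrix values below are illustrative placeholders, not parameters from the paper:

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy) for illustration only.
K = np.array([[525.0,   0.0, 319.5],
              [  0.0, 525.0, 239.5],
              [  0.0,   0.0,   1.0]])

def project_points(points_cam, K):
    """Project an Nx3 array of camera-frame 3D points to Nx2 pixel coords."""
    uvw = points_cam @ K.T            # homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide by depth

# One point 2 m in front of the camera, slightly off-axis.
pts = np.array([[0.1, -0.2, 2.0]])
uv = project_points(pts, K)           # pixel location of that 3D point
print(uv)
```

Fusion methods built on this correspondence look up the 2D feature at `uv` for each 3D point; points that project outside the image or are occluded have no valid match, which is one source of the rigidity the paper criticizes.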
One Citation
Learning Hybrid Semantic Affinity for Point Cloud Segmentation
- IEEE Transactions on Circuits and Systems for Video Technology
- 2022
This paper presents a hybrid semantic affinity learning method (HSA) to capture and leverage the dependencies of categories for 3D semantic segmentation and proposes the concept of local affinity to effectively model the intra-class and inter-class semantic similarities for adjacent neighborhoods.
References
Showing 1–10 of 51 references
Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
A novel fusion-aware 3D point convolution which operates directly on the geometric surface being reconstructed and exploits effectively the inter-frame correlation for high-quality 3D feature learning is proposed.
Dense 3D semantic mapping of indoor scenes from RGB-D images
- 2014 IEEE International Conference on Robotics and Automation (ICRA)
- 2014
A novel 2D–3D label transfer based on Bayesian updates and dense pairwise 3D Conditional Random Fields is proposed, and it is shown that a semantic segmentation is not needed for every frame in a sequence in order to create accurate semantic 3D reconstructions.
Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations
- NeurIPS
- 2019
This paper enriches each point representation by performing a novel gated fusion on the point itself and its contextual point representations, and proposes a novel graph pointnet module, relying on the graph attention block to dynamically compose and update each point representation within the local point cloud structure.
A Unified Point-Based Framework for 3D Segmentation
- 2019 International Conference on 3D Vision (3DV)
- 2019
A new unified point-based framework for 3D point cloud segmentation is presented that effectively optimizes pixel-level features, geometrical structures, and global context priors of an entire scene, and outperforms several state-of-the-art approaches.
SemanticFusion: Dense 3D semantic mapping with convolutional neural networks
- 2017 IEEE International Conference on Robotics and Automation (ICRA)
- 2017
This work combines Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localization and Mapping (SLAM) system, ElasticFusion, which provides long-term dense correspondences between frames of indoor RGB-D video even during loopy scanning trajectories, and produces a useful semantic 3D map.
Pix2Vox: Context-Aware 3D Reconstruction From Single and Multi-View Images
- 2019
A novel framework for single-view and multi-view 3D reconstruction, named Pix2Vox, which outperforms the state of the art by a large margin and is 24 times faster than 3D-R2N2 in terms of backward inference time.
Deep multimodal fusion for semantic image segmentation: A survey
- Image Vis. Comput.
- 2021
Dual Attention Network for Scene Segmentation
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
New state-of-the-art segmentation performance is achieved on three challenging scene segmentation datasets (Cityscapes, PASCAL Context, and COCO Stuff) without using coarse data.
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
DenseFusion is a generic framework for estimating 6D pose of a set of known objects from RGB-D images that processes the two data sources individually and uses a novel dense fusion network to extract pixel-wise dense feature embedding, from which the pose is estimated.