Anisotropic Convolutional Networks for 3D Semantic Scene Completion

@inproceedings{Li2020AnisotropicCN,
  title={Anisotropic Convolutional Networks for 3D Semantic Scene Completion},
  author={Jie Li and K. Han and Peng Wang and Yu Liu and Xia Yuan},
  booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={3348--3356}
}
  • Jie Li, K. Han, Peng Wang, Yu Liu, Xia Yuan
  • Published 5 April 2020
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
As a voxel-wise labeling task, semantic scene completion (SSC) tries to simultaneously infer the occupancy and semantic labels for a scene from a single depth and/or RGB image. The key challenge for SSC is how to effectively take advantage of the 3D context to model various objects or stuff with severe variations in shapes, layouts, and visibility. To handle such variations, we propose a novel module called anisotropic convolution, which possesses flexibility and power impossible for the…
S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
TLDR
A method that subsumes the sparsity of large-scale environments and presents S3CNet, a sparse convolution based neural network that predicts the semantically completed scene from a single, unified LiDAR point cloud, achieving state-of-the-art results on the SemanticKITTI benchmark.
IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement
TLDR
This work argues that this sequential scheme does not ensure these two tasks fully benefit each other, and presents an Iterative Mutual Enhancement Network (IMENet) to solve them jointly, which interactively refines the two tasks at the late prediction stage.
RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
TLDR
RfD-Net is introduced, which jointly detects and reconstructs dense object surfaces directly from raw point clouds; it consistently outperforms the state of the art and improves mesh IoU in object reconstruction by over 11.
Point Cloud Semantic Scene Completion from RGB-D Images
TLDR
A novel semantic completion network, called point cloud semantic scene completion network (PCSSC-Net), for indoor scenes solely based on point clouds, which designs a patch-based contextual encoder to extract and infer comprehensive information from partial input, and articulates a semantics-guided completion decoder.
LSMVOS: Long-Short-Term Similarity Matching for Video Object
TLDR
A new propagation method that uses short-term matching modules to extract the information of the previous frame and apply it in propagation, together with a Long-Short-Term similarity matching network for video object segmentation (LSMVOS).
3D Semantic Scene Completion: a Survey
TLDR
This survey aims to identify, compare and analyze the techniques, providing a critical analysis of the SSC literature on both methods and datasets, and provides an in-depth analysis of the existing works covering all choices made by the authors while highlighting the remaining avenues of research.
Data Augmented 3D Semantic Scene Completion with 2D Segmentation Priors
Semantic scene completion (SSC) is a challenging Computer Vision task with many practical applications, from robotics to assistive computing. Its goal is to infer the 3D geometry in a field of view…
Dynamic Neural Networks: A Survey
TLDR
This survey comprehensively reviews this rapidly developing area of dynamic networks by dividing dynamic networks into three main categories: sample-wise dynamic models that process each sample with data-dependent architectures or parameters; spatial-wise dynamic networks that conduct adaptive computation with respect to different spatial locations of image data; and temporal-wise dynamic networks that perform adaptive inference along the temporal dimension for sequential data.
Semantic Scene Completion via Integrating Instances and Scene in-the-Loop
TLDR
This work presents a novel framework named Scene-Instance-Scene Network (SISNet), which takes advantage of both instance and scene level semantic information, and is capable of inferring fine-grained shape details as well as nearby objects whose semantic categories are easily mixed up.

References

SHOWING 1-10 OF 27 REFERENCES
Semantic Scene Completion from a Single Depth Image
TLDR
The semantic scene completion network (SSCNet) is introduced, an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum.
Efficient Semantic Scene Completion Network with Spatial Group Convolution
TLDR
An efficient 3D sparse convolutional network is presented, which harnesses a multiscale architecture and a coarse-to-fine prediction strategy and achieves state-of-the-art performance and fast speed.
Two Stream 3D Semantic Scene Completion
TLDR
This work proposes a two stream approach that leverages depth information and semantic information, which is inferred from the RGB image, for this task and substantially outperforms the state-of-the-art for semantic scene completion.
RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion
  • Jie Li, Y. Liu, +4 authors I. Reid
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
A light-weight Dimensional Decomposition Residual network (DDR) for 3D dense prediction tasks is introduced, and the proposed multi-scale fusion mechanism for depth and color images can improve the completion and segmentation accuracy simultaneously.
View-Volume Network for Semantic Scene Completion from a Single Depth Image
TLDR
A View-Volume convolutional neural network (VVNet) for inferring the occupancy and semantic labels of a volumetric 3D scene from a single depth image, which demonstrates its efficiency and effectiveness on both the synthetic SUNCG and the real NYU datasets.
Depth Based Semantic Scene Completion With Position Importance Aware Loss
TLDR
PALNet is proposed, a novel hybrid network for SSC based on single depth using a two-stream network to extract both 2D and 3D features from multiple stages using fine-grained depth information to efficiently capture the context, as well as the geometric cues of the scene.
Joint 3D Object and Layout Inference from a Single RGB-D Image
TLDR
This work proposes a high-order graphical model that jointly reasons about the layout, objects and superpixels in the image, and demonstrates that the proposed method is able to infer scenes with a large degree of clutter and occlusions.
Structured Prediction of Unobserved Voxels from a Single Depth Image
TLDR
This work proposes an algorithm that can complete the unobserved geometry of tabletop-sized objects, based on a supervised model trained on already available volumetric elements, that maps from a local observation in a single depth image to an estimate of the surface shape in the surrounding neighborhood.
Rethinking Atrous Convolution for Semantic Image Segmentation
TLDR
The proposed 'DeepLabv3' system significantly improves over the previous DeepLab versions without DenseCRF post-processing and attains comparable performance with other state-of-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.
Deformable Convolutional Networks
TLDR
This work introduces two new modules to enhance the transformation modeling capability of CNNs, namely, deformable convolution and deformable RoI pooling, based on the idea of augmenting the spatial sampling locations in the modules with additional offsets and learning the offsets from the target tasks, without additional supervision.