Semantic Scene Completion via Integrating Instances and Scene in-the-Loop

@inproceedings{cai2021semantic,
  title={Semantic Scene Completion via Integrating Instances and Scene in-the-Loop},
  author={Yingjie Cai and Xuesong Chen and Chao Zhang and Kwan-Yee Lin and Xiaogang Wang and Hongsheng Li},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
Semantic Scene Completion aims at reconstructing a complete 3D scene with precise voxel-wise semantics from a single-view depth or RGBD image. It is a crucial but challenging problem for indoor scene understanding. In this work, we present a novel framework named Scene-Instance-Scene Network (SISNet), which takes advantage of both instance- and scene-level semantic information. Our method is capable of inferring fine-grained shape details as well as nearby objects whose semantic categories are…


Learning Local Displacements for Point Cloud Completion
A novel approach to object and semantic scene completion from a partial scan, represented as a 3D point cloud, is proposed, along with a second model that assembles the proposed layers within a transformer architecture, achieving state-of-the-art performance.
3D Semantic Scene Completion: a Survey
This survey identifies, compares, and analyzes the techniques in the SSC literature, providing a critical analysis of both methods and datasets, an in-depth review of the design choices made in existing works, and a discussion of the remaining avenues of research.
Learning a Structured Latent Space for Unsupervised Point Cloud Completion
A novel framework that learns a unified and structured latent space encoding both partial and complete point clouds, consistently outperforming state-of-the-art unsupervised methods on the synthetic ShapeNet and real-world KITTI, ScanNet, and Matterport3D datasets.
MotionSC: Data Set and Network for Real-Time Semantic Mapping in Dynamic Environments
This work addresses a gap in semantic scene completion (SSC) data by creating a novel outdoor data set with accurate and complete dynamic scenes, formed from randomly sampled views of…
Data Augmented 3D Semantic Scene Completion with 2D Segmentation Priors
SPAwN is presented, a novel lightweight multimodal 3D deep CNN that seamlessly fuses structural data from the depth component of RGB-D images with semantic priors from a bimodal 2D segmentation network.
MonoScene: Monocular 3D Semantic Scene Completion
Experiments show that the MonoScene framework outperforms the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view.
Semantic Scene Completion from a Single Depth Image
The semantic scene completion network (SSCNet) is introduced, an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum.
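To illustrate what "simultaneously outputs occupancy and semantic labels for all voxels" can look like in practice, here is a minimal sketch of a common SSC output convention, where class 0 denotes empty space. The grid size, class count, and random logits are illustrative assumptions, not SSCNet's actual architecture or resolution:

```python
import numpy as np

# Toy per-voxel class logits: class 0 = empty space,
# classes 1..C = semantic categories (shapes are illustrative only).
C = 3                                      # assumed number of semantic classes
rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 4, 4, C + 1))

labels = logits.argmax(axis=-1)            # voxel-wise semantic label in 0..C
occupancy = labels != 0                    # a voxel is occupied iff non-empty
```

Under this convention, a single prediction head yields both outputs: semantic labels fall out of the argmax, and occupancy is simply the set of voxels not predicted as empty.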
Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion
This work proposes a novel deep learning framework, named Cascaded Context Pyramid Network (CCPNet), to jointly infer the occupancy and semantic labels of a volumetric 3D scene from a single depth image, and improves the labeling coherence with a cascaded context pyramid.
Semantic scene completion with dense CRF from a single depth image
3D Sketch-Aware Semantic Scene Completion via Semi-Supervised Structure Prior
A new geometry-based strategy is proposed to embed depth information in a low-resolution voxel representation that still encodes sufficient geometric information, e.g., room layout and object sizes and shapes, to infer the invisible areas of the scene with structure-preserving detail.
Two Stream 3D Semantic Scene Completion
This work proposes a two-stream approach that leverages depth information together with semantic information inferred from the RGB image, and substantially outperforms the state of the art for semantic scene completion.
Semantic Scene Completion Combining Colour and Depth: preliminary experiments
The potential of the RGB colour channels to improve SSCNet, a method that performs scene completion and semantic labelling in a single end-to-end 3D convolutional network, is investigated.
SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion
This work proposes a framework that performs scene reconstruction and semantic scene completion jointly, incrementally, and in real time from an input sequence of depth maps. It relies on a novel neural architecture designed to process occupancy maps, and leverages voxel states to fuse semantic completion with the 3D global model accurately and efficiently.
Anisotropic Convolutional Networks for 3D Semantic Scene Completion
A novel module called anisotropic convolution is proposed, offering flexibility and modeling power that competing operations such as standard 3D convolution and its variants cannot match.
See and Think: Disentangling Semantic Scene Completion
Experimental results show that, whether the input is a single depth image or an RGB-D image, the proposed disentangled framework generates high-quality semantic scene completion and outperforms state-of-the-art approaches on both synthetic and real datasets.
EdgeNet: Semantic Scene Completion from RGB-D images
A new strategy to encode colour information in 3D space using edge detection and a flipped truncated signed distance function is presented, along with EdgeNet, a new end-to-end neural network architecture that handles features generated from the fusion of depth and edge information.
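As a rough illustration of the flipped truncated signed distance encoding mentioned above, the sketch below inverts a truncated SDF so that voxels near the surface get large magnitudes while far free space decays toward zero. This shows the general idea only; the `trunc` parameter, function name, and surface-sign handling are assumptions, not EdgeNet's exact formulation:

```python
import numpy as np

def flipped_tsdf(tsdf, trunc=1.0):
    """Flip a truncated SDF: surface voxels -> large magnitude,
    voxels at the truncation distance -> zero. Assumes `tsdf` is
    already truncated to [-trunc, trunc]."""
    sign = np.sign(tsdf)
    sign[sign == 0] = 1  # treat exact surface voxels as positive side
    return sign * (trunc - np.abs(tsdf))
```

The flipped encoding concentrates signal near object surfaces, which is often argued to make the representation easier for 3D CNNs to exploit than a raw TSDF, where large uninformative values dominate free space.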