Corpus ID: 237278052

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization

@article{zhang2021deeppanocontext,
  title={DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization},
  author={Cheng Zhang and Zhaopeng Cui and Cai Chen and Shuaicheng Liu and Bing Zeng and Hujun Bao and Yinda Zhang},
  journal={arXiv preprint},
  year={2021}
}
  • Published 2021
  • Computer Science
  • ArXiv
Panorama images have a much larger field-of-view and thus naturally encode richer scene context than standard perspective images; however, this context has not been well exploited by previous scene understanding methods. In this paper, we propose a novel method for panoramic 3D scene understanding which recovers the 3D room layout and the shape, pose, position, and semantic category of each object from a single full-view panorama image. In order to fully utilize the rich context…

PanoContext: A Whole-Room 3D Context Model for Panoramic Scene Understanding
Experiments show that, based solely on 3D context without any image-region category classifier, the proposed whole-room context model achieves performance comparable to the state-of-the-art object detector, demonstrating that when the FOV is large, context is as powerful as object appearance.
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
This paper presents an approach to embed 3D context into the topology of a neural network trained to perform holistic scene understanding; it generates partially synthetic depth images rendered by replacing real objects with CAD models of the same object category drawn from a repository.
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
An algorithm is proposed to predict room layout from a single image that generalizes across panoramas and perspective images, cuboid layouts and more general layouts (e.g. "L"-shaped rooms); it achieves among the best accuracy for perspective images and can handle both cuboid-shaped and more general Manhattan layouts.
HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch Data Augmentation
The proposed network, HorizonNet, trained to predict a 1D layout representation, outperforms previous state-of-the-art approaches, and the proposed Pano Stretch data augmentation can diversify panorama data and be applied to other panorama-related learning tasks.
SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
A message-passing graph neural network is proposed to model the inter-relationships between objects and layout, guiding the generation of a globally consistent object alignment in a scene by considering the global scene layout.
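The message-passing idea above (nodes for objects and layout exchanging features along graph edges) can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, weight matrices, and mean aggregation are hypothetical choices for illustration only:

```python
import numpy as np

def message_passing_step(features, edges, w_msg, w_self):
    """One round of mean-aggregation message passing on a scene graph.

    features: (N, D) node features (e.g. object nodes plus a layout node)
    edges:    list of (src, dst) directed edges
    w_msg, w_self: (D, D) weight matrices (hypothetical learned parameters)
    """
    n, _ = features.shape
    agg = np.zeros_like(features)
    count = np.zeros(n)
    for src, dst in edges:
        agg[dst] += features[src]   # accumulate incoming messages
        count[dst] += 1
    count = np.maximum(count, 1)    # avoid division by zero for isolated nodes
    mean_msg = agg / count[:, None]
    # combine self features with aggregated neighbor messages, then ReLU
    return np.maximum(features @ w_self + mean_msg @ w_msg, 0.0)

# Example: two object nodes (0, 1) bidirectionally connected to a layout node (2)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 2), (1, 2), (2, 0), (2, 1)]
updated = message_passing_step(feats, edges, np.eye(2), np.eye(2))
```

Stacking several such steps lets each object's representation absorb information about the layout and the other objects, which is the intuition behind using graph context to guide alignment.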
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama
A deep learning framework, called DuLa-Net, is proposed to predict Manhattan-world 3D room layouts from a single RGB panorama; it leverages two projections of the panorama at once, namely the equirectangular panorama-view and the perspective ceiling-view, each of which contains different clues about the room layout.
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes From a Single Image
This paper proposes an end-to-end solution to jointly reconstruct room layout, object bounding boxes, and meshes from a single image, and argues that understanding the context of each component can assist the task of parsing the others, enabling joint understanding and reconstruction.
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
This paper presents a new synthetic dataset, Structured3D, with the aim of providing large-scale photo-realistic images with rich 3D structure annotations for a wide spectrum of structured 3D modeling tasks, and takes advantage of the availability of professional interior designs to automatically extract 3D structures from them.
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
A Holistic Scene Grammar (HSG) is introduced to represent the 3D scene structure, which characterizes a joint distribution over the functional and geometric space of indoor scenes, and significantly outperforms prior methods on 3D layout estimation, 3D object detection, and holistic scene understanding.
SeeThrough: Finding Chairs in Heavily Occluded Indoor Scene Images
This work uses a neural network trained on real indoor annotated images to extract 2D keypoints, and solves a global selection problem among 3D candidates using pairwise co-occurrence statistics discovered from a large 3D scene database.