Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, Zihan Zhou

Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding. This work takes advantage of the availability of millions of professional interior designs, automatically extracts 3D structures from them, and generates high-quality images with an industry-leading rendering engine.

BuildingNet: Learning to Label 3D Buildings

This work introduces a large-scale dataset of 3D building models whose exteriors are consistently labeled, together with a graph neural network that labels building meshes by analyzing the spatial and structural relations of their geometric primitives and significantly improves performance over several baselines for labeling 3D meshes.

Automatic 3D reconstruction of structured indoor environments

An up-to-date integrative view of the field is provided, bridging complementary perspectives from computer graphics and computer vision, and the structure of output models and the priors exploited to bridge the gap between imperfect sources and the desired output are defined.

OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets

This work proposes a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics, and shows that deep networks trained on the proposed dataset achieve competitive performance for shape, material and lighting estimation on real images.

Monocular Spherical Depth Estimation with Explicitly Connected Weak Layout Cues

Synthetic 3D Data Generation Pipeline for Geometric Deep Learning in Architecture

A field-specific synthetic data generation pipeline that generates an arbitrary amount of 3D data along with the associated 2D and 3D annotations, suitable for multiple deep learning tasks, including geometric deep learning that requires direct 3D supervision.

OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets

This work aims to make the dataset creation process for indoor scenes widely accessible, allowing researchers to transform casually acquired scans to large-scale datasets with high-quality ground truth, by estimating consistent furniture and scene layout and ascribing high quality materials to all surfaces.

Layout-Guided Novel View Synthesis from a Single Indoor Panorama

This paper makes the first attempt to generate novel views from a single indoor panorama while taking large camera translations into consideration, using convolutional neural networks to extract deep features and estimate the depth map from the source-view image.

MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis

MINERVAS, a Massive INterior EnviRonments VirtuAl Synthesis system, is proposed to facilitate 3D scene modification and 2D image synthesis for various vision tasks; it empowers users to access commercial scene databases with millions of indoor scenes while protecting the copyright of core data assets.

PanoDR: Spherical Panorama Diminished Reality for Indoor Scenes

This work proposes a model that first predicts the structure of an indoor scene and then uses it to guide the reconstruction of an empty, background-only representation of the same scene, ensuring structure-aware counterfactual inpainting.

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

  • Mike Roberts, Nathan Paczan
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021
This work introduces Hypersim, a photorealistic synthetic dataset for holistic indoor scene understanding, and finds that it is possible to generate the entire dataset from scratch, for roughly half the cost of training a popular open-source natural language processing model.



Learning to Reconstruct 3D Manhattan Wireframes From a Single Image

This work proposes a method to effectively exploit global structural regularities for obtaining a compact, accurate, and intuitive 3D wireframe representation by training a single convolutional neural network to simultaneously detect salient junctions and straight lines and predict their 3D depth and vanishing points.
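Under the Manhattan-world assumption used by such wireframe methods, scene lines align with three orthogonal directions, and each 3D direction maps to a single vanishing point in the image. A minimal sketch of that mapping for a pinhole camera (the focal length and principal point values here are hypothetical, for illustration only):

```python
def vanishing_point(d, f, cx, cy):
    """Vanishing point of 3D direction d = (dx, dy, dz) under a
    pinhole camera with focal length f and principal point (cx, cy).
    Directions parallel to the image plane vanish at infinity."""
    dx, dy, dz = d
    if dz == 0:
        return None  # point at infinity; no finite vanishing point
    return (f * dx / dz + cx, f * dy / dz + cy)

# The camera's forward axis (0, 0, 1) vanishes at the principal point:
# vanishing_point((0, 0, 1), 500, 320, 240) → (320.0, 240.0)
```

Lines detected in the image can be grouped by which of the three vanishing points they pass through, which is one way such global regularities constrain the 3D wireframe.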

InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset

This dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets to provide a higher degree of photo-realism, larger scale, more variability as well as serving a wider range of purposes compared to existing datasets.

3D Interpreter Networks for Viewer-Centered Wireframe Modeling

This work proposes 3D INterpreter Networks (3D-INN), an end-to-end trainable framework that sequentially estimates 2D keypoint heatmaps and 3D object skeletons and poses and proposes a Projection Layer, mapping estimated 3D structure back to 2D.
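The Projection Layer described above maps an estimated 3D structure back to 2D, so that reprojection error against 2D keypoint annotations can supervise the 3D estimates. A minimal sketch of that idea as plain perspective projection (toy points and unit focal length assumed; this is not the authors' implementation):

```python
def project(points_3d, f=1.0):
    """Perspective-project 3D points (x, y, z), given in camera
    coordinates, onto the image plane: u = f*x/z, v = f*y/z."""
    return [(f * x / z, f * y / z) for x, y, z in points_3d]

# A toy three-keypoint "skeleton" two units in front of the camera.
skeleton = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
keypoints_2d = project(skeleton)
# e.g. the point (1.0, 0.0, 2.0) projects to (0.5, 0.0)
```

Because the projection is differentiable in the 3D coordinates, a loss on the projected 2D keypoints can back-propagate to the 3D skeleton and pose estimates.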

Recovering 3D Planes from a Single Image via Convolutional Neural Networks

A novel plane structure-induced loss is proposed to train the network to simultaneously predict a plane segmentation map and the parameters of the 3D planes, which significantly outperforms existing methods, both qualitatively and quantitatively.
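Predicting plane parameters together with a segmentation map works because a plane's parameters determine a depth value at every pixel it covers: for a plane n·p = d and the viewing ray (u, v, 1) through a pixel, the depth is z = d / (n·(u, v, 1)). A minimal sketch of that depth-from-plane relation (hypothetical values, normalized image coordinates; not the paper's loss implementation):

```python
def plane_depth(n, d, u, v):
    """Depth z along the viewing ray (u, v, 1) at which it meets the
    plane n·p = d, where n is a 3-vector normal and d the offset."""
    denom = n[0] * u + n[1] * v + n[2]
    return d / denom

# A fronto-parallel plane z = 3 (normal (0, 0, 1), offset 3) has
# depth 3 at every pixel:
# plane_depth((0.0, 0.0, 1.0), 3.0, 0.2, -0.1) → 3.0
```

Comparing these per-plane depths against ground-truth depth inside each predicted segment is one way a plane-structure-induced loss can couple the segmentation and parameter predictions.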

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

A semi-automatic framework is proposed that employs existing detection methods and enhances them using two main constraints: framing query images sampled on panoramas to maximize the performance of 2D detectors, and enforcing multi-view consistency across 2D detections that originate from different camera locations.

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

This work introduces a large-scale synthetic dataset with 500K physically-based rendered images from 45K realistic 3D indoor scenes and shows that pretraining with this new synthetic dataset can improve results beyond the current state of the art on all three computer vision tasks.

Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach

This work proposes an approach to cross-domain semantic segmentation with the auxiliary geometric information, which can also be easily obtained from virtual environments, and achieves a clear performance gain compared to the baselines and various competing methods.

Joint 2D-3D-Semantic Data for Indoor Scene Understanding

A dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from the 2D, 2.5D, and 3D domains, with instance-level semantic and geometric annotations, enabling the development of joint and cross-modal learning models and potentially unsupervised approaches that exploit the regularities present in large-scale indoor spaces.

Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding

A novel two-stage method based on associative embedding, inspired by its recent success in instance segmentation, that is able to detect an arbitrary number of planes and facilitate many real-time applications such as visual SLAM and human-robot interaction.

SUN RGB-D: A RGB-D scene understanding benchmark suite

This paper introduces an RGB-D benchmark suite with the goal of advancing the state of the art in all major scene understanding tasks, and presents a dataset that enables training data-hungry algorithms for scene-understanding tasks, evaluating them using meaningful 3D metrics, avoiding overfitting to a small testing set, and studying cross-sensor bias.