• Corpus ID: 189762191

The Replica Dataset: A Digital Replica of Indoor Spaces

Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J. Engel, Raul Mur-Artal, Carl Yuheng Ren, Shobhit Verma, Anton Clarkson, Ming Yan, Brian Budge, Yajie Yan, Xiaqing Pan, June Yon, Yuyang Zou, Kimberly Leon, Nigel Carter, Jesus Briales, Tyler Gillingham, Elias Mueggler, Luis Pesqueira, Manolis Savva, Dhruv Batra, Hauke Malte Strasdat, Renzo De Nardi, Michael Goesele, S. Lovegrove, Richard A. Newcombe
We introduce Replica, a dataset of 18 highly photo-realistic 3D indoor scene reconstructions at room and building scale. Each scene consists of a dense mesh, high-resolution high-dynamic-range (HDR) textures, per-primitive semantic class and instance information, and planar mirror and glass reflectors. The goal of Replica is to enable machine learning (ML) research that relies on visually, geometrically, and semantically realistic generative models of the world - for instance, egocentric… 

OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets
This work proposes a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics, and shows that deep networks trained on the proposed dataset achieve competitive performance for shape, material and lighting estimation on real images.
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets
This work aims to make the dataset creation process for indoor scenes widely accessible, allowing researchers to transform casually acquired scans to large-scale datasets with high-quality ground truth, by estimating consistent furniture and scene layout and ascribing high quality materials to all surfaces.
3D-Aware Indoor Scene Synthesis with Depth Priors
This work proposes a dual-path generator in which one path is responsible for depth generation and its intermediate features are injected into the other path as the condition for appearance rendering, which eases 3D-aware synthesis by supplying explicit geometry information.
Material and Lighting Reconstruction for Complex Indoor Scenes with Texture-space Differentiable Rendering
This work presents a robust optimization pipeline based on differentiable rendering to recover physically based materials and illumination, leveraging RGB and geometry captures, and introduces a novel texture-space sampling technique and carefully chosen inductive priors to help guide reconstruction, avoiding low-quality or implausible local minima.
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
Mike Roberts, Nathan Paczan · Computer Science, Environmental Science · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 2021
This work introduces Hypersim, a photorealistic synthetic dataset for holistic indoor scene understanding, and finds that it is possible to generate the entire dataset from scratch, for roughly half the cost of training a popular open-source natural language processing model.
A Learned Stereo Depth System for Robotic Manipulation in Homes
We present a passive stereo depth system that produces dense and accurate point clouds optimized for human environments, including dark, textureless, thin, reflective and specular surfaces and…
Embodied Navigation at the Art Gallery
This paper builds and releases a new 3D space with unique characteristics: a complete art museum, named ArtGallery3D (AG3D), which is larger and richer in visual features than existing environments and provides very sparse occupancy information.
Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
This paper provides a new approach to scene understanding from a synthesis-model perspective, leveraging recent progress in implicit 3D representation and neural rendering, and introduces Scene-Property Synthesis with NeRF (SS-NeRF), a powerful tool for bridging generative learning and discriminative learning.
MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis
This work presents MINERVAS, a Massive INterior EnviRonments VirtuAl Synthesis system, which facilitates 3D scene modification and 2D image synthesis for various vision tasks, empowers users to access commercial scene databases with millions of indoor scenes, and protects the copyright of core data assets.
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
Habitat-Matterport 3D is a large-scale dataset of 1,000 building-scale 3D reconstructions from a diverse set of real-world locations that is ‘pareto optimal’ in the following sense: agents trained to perform PointGoal navigation on HM3D achieve the highest performance regardless of whether they are evaluated on HM3D, Gibson, or MP3D.


InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
This dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets to provide a higher degree of photo-realism, larger scale, more variability as well as serving a wider range of purposes compared to existing datasets.
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data
This work focuses on depth-based semantic per-pixel labelling as a scene understanding problem and shows the potential of computer graphics to generate virtually unlimited labelled data from synthetic 3D scenes by carefully synthesizing training data with appropriate noise models.
The RobotriX: An Extremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions
The RobotriX is an extremely photorealistic indoor dataset designed to enable the application of deep learning techniques to a wide variety of robotic vision problems and will serve as a new milestone for investigating 2D and 3D robotic vision tasks with large-scale data-driven techniques.
ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.
Semantic Scene Completion from a Single Depth Image
The semantic scene completion network (SSCNet) is introduced, an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum.
Matterport3D: Learning from RGB-D Data in Indoor Environments
Matterport3D is introduced, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes that enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.
Reconstructing scenes with mirror and glass surfaces
This work introduces a fully automatic pipeline that allows us to reconstruct the geometry and extent of planar glass and mirror surfaces while being able to distinguish between the two.
3D Semantic Parsing of Large-Scale Indoor Spaces
This paper argues that identifying structural elements in indoor spaces is essentially a detection problem, rather than the commonly used segmentation, and proposes a hierarchical method for semantically parsing the 3D point cloud of an entire building.
Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene…
Example-based synthesis of 3D object arrangements
This work introduces a probabilistic model for scenes based on Bayesian networks and Gaussian mixtures that can be trained from a small number of input examples, and develops a clustering algorithm that groups objects occurring in a database of scenes according to their local scene neighborhoods.