Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation

Jeevan Devaranjan, Amlan Kar, and Sanja Fidler. In: European Conference on Computer Vision (ECCV).
Procedural models are widely used to synthesize scenes for graphics and gaming, and to create labeled synthetic datasets for machine learning. To produce realistic and diverse scenes, the parameters governing a procedural model must be carefully tuned by experts. These parameters control both the structure of the generated scenes (e.g., how many cars appear) and the placement of objects in valid configurations. Meta-Sim aimed at automatically tuning parameters…
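
A procedural model of this kind can be pictured as a sampler with both structural parameters (object counts) and continuous placement parameters. The toy model below is purely illustrative; the distributions, attribute names, and ranges are assumptions, not Meta-Sim2's actual grammar.

```python
import random

def sample_scene(rng, max_cars=5, road_length=100.0):
    """Sample one scene description from a toy procedural model."""
    n_cars = rng.randint(0, max_cars)            # structure: how many cars
    cars = []
    for _ in range(n_cars):
        cars.append({
            "x": rng.uniform(0.0, road_length),  # placement along the road
            "lane": rng.choice([0, 1]),          # discrete attribute
        })
    return {"n_cars": n_cars, "cars": cars}

rng = random.Random(0)
scene = sample_scene(rng)
```

Tuning such a model means adjusting the distributions above (here hard-coded as uniform) so that sampled scenes match a target domain.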

Self-Supervised Real-to-Sim Scene Generation

Sim2SG is a self-supervised automatic scene-generation technique that matches the distribution of real data without requiring annotations from the real-world dataset, making it applicable in situations where such annotations are difficult to obtain.

Neural-Sim: Learning to Generate Training Data with NeRF

This work presents the first fully differentiable synthetic data pipeline that uses Neural Radiance Fields (NeRFs) in a closed loop with a target application's loss function, and generates data on demand, with no human labor, to maximize accuracy for a target task.

Meta-simulation for the Automated Design of Synthetic Overhead Imagery

This work proposes an approach to automatically choose the design of synthetic imagery based upon unlabeled real-world imagery, and shows that training segmentation models with NAMS-designed imagery yields superior results compared to naïve randomized designs and state-of-the-art meta-simulation methods.

Semantically Controllable Scene Generation with Guidance of Explicit Knowledge

A novel method is introduced that incorporates domain knowledge explicitly in the generation process, achieving semantically controllable scene generation by imposing semantic rules on the properties of nodes and edges in the tree structure.

SceneGen: Learning to Generate Realistic Traffic Scenes

SceneGen is presented—a neural autoregressive model of traffic scenes that eschews the need for rules and heuristics and can be used to train perception models that generalize to the real world.
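
An autoregressive scene model of this kind places actors one at a time, each conditioned on those already placed. The sketch below is a deliberately simple stand-in: it conditions only through a minimum-gap rejection heuristic, whereas SceneGen uses a learned neural model; all names and constants are illustrative.

```python
import random

def place_actors(rng, n_actors, min_gap=5.0, span=100.0):
    """Place actors sequentially, each conditioned on earlier placements."""
    placed = []
    for _ in range(n_actors):
        for _attempt in range(100):              # rejection sampling
            x = rng.uniform(0.0, span)
            if all(abs(x - p) >= min_gap for p in placed):
                placed.append(x)
                break
    return placed

rng = random.Random(1)
xs = place_actors(rng, 5)
```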

Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data

Task2Sim, a unified model mapping downstream task representations to optimal simulation parameters for generating synthetic pre-training data, is introduced; it yields significantly better downstream performance than non-adaptively chosen simulation parameters on both seen and unseen tasks.

Photo-realistic Neural Domain Randomization

It is shown that the recent progress in neural rendering enables a new unified approach to learn a composition of neural networks that acts as a physics-based ray tracer generating high-quality renderings from scene geometry alone, called Photo-realistic Neural Domain Randomization (PNDR).

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

ATISS is presented, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments given only the room type and its floor plan; it has fewer parameters, is simpler to implement and train, and runs up to 8x faster than existing methods.

Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning

A novel method is introduced that teaches a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, leveraging the physical plausibility of the scene via a novel scene grammar that represents structured clutter.

Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields

This paper provides a new approach to scene understanding, from a synthesis model perspective, by leveraging the recent progress on implicit scene representation and neural rendering, and introduces Scene-Property Synthesis with NeRF (SS-NeRF).

Meta-Sim: Learning to Generate Synthetic Datasets

Meta-Sim is proposed, which learns a generative model of synthetic scenes and obtains images, together with their corresponding ground truth, via a graphics engine; it can greatly improve content-generation quality over a human-engineered probabilistic scene grammar.
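
The core idea of tuning a scene generator toward real data can be sketched as matching a simulated statistic to a statistic measured on the real distribution. The gradient-free update below is only an illustration of that feedback loop; Meta-Sim itself trains a neural scene-graph model end to end, and every quantity here (the Gaussian count model, the learning rate) is an assumption.

```python
import random

def tune_mean_count(real_mean, init=1.0, lr=0.5, steps=50, seed=0):
    """Nudge a scene parameter (mean object count) toward a real-data statistic."""
    rng = random.Random(seed)
    mu = init
    for _ in range(steps):
        # simulate a batch and measure its statistic
        sim_mean = sum(rng.gauss(mu, 1.0) for _ in range(64)) / 64
        mu += lr * (real_mean - sim_mean)   # move toward the real statistic
    return mu

mu = tune_mean_count(real_mean=4.0)
```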

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

We present a framework for efficient inference in structured image models that explicitly reason about objects. We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time.

Picture: A probabilistic programming language for scene perception

Picture is presented, a probabilistic programming language for scene understanding that allows researchers to express complex generative vision models, while automatically solving them using fast general-purpose inference machinery.

Neural Scene De-rendering

This work proposes a new approach to learn an interpretable distributed representation of scenes, using a deterministic rendering function as the decoder and an object-proposal-based encoder, trained by minimizing both the supervised prediction and the unsupervised reconstruction errors.

The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes

This paper generates a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations, and conducts experiments with DCNNs showing that including SYNTHIA in the training stage significantly improves performance on the semantic segmentation task.

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

This work introduces a large-scale synthetic dataset with 500K physically-based rendered images from 45K realistic 3D indoor scenes and shows that pretraining with this new synthetic dataset can improve results beyond the current state of the art on all three computer vision tasks.

Understanding Real World Indoor Scenes with Synthetic Data

This work focuses on depth-based semantic per-pixel labelling as a scene understanding problem and shows the potential of computer graphics to generate virtually unlimited labelled data from synthetic 3D scenes.

LayoutVAE: Stochastic Scene Layout Generation From a Label Set

LayoutVAE is a versatile modeling framework that can generate full image layouts given a label set, or per-label layouts for an existing image given a new label; it is also capable of detecting unusual layouts, potentially providing a way to evaluate the layout generation problem.
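
A label-conditioned layout sampler of this kind maps each label in the set to a bounding box. In the sketch below, hand-set per-label priors stand in for LayoutVAE's learned decoder, and the jittered sampling stands in for drawing from the latent space; labels, priors, and box parameterization are all illustrative assumptions.

```python
import random

PRIORS = {  # label -> (center x, center y, width, height), all in [0, 1]
    "sky":    (0.5, 0.15, 0.9, 0.3),
    "person": (0.5, 0.60, 0.2, 0.5),
}

def sample_layout(labels, rng, jitter=0.05):
    """Sample one bounding box per label around its prior."""
    layout = {}
    for label in labels:
        cx, cy, w, h = PRIORS[label]
        layout[label] = tuple(
            min(1.0, max(0.0, v + rng.uniform(-jitter, jitter)))
            for v in (cx, cy, w, h)
        )
    return layout

layout = sample_layout(["sky", "person"], random.Random(2))
```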

Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data

The power of Structured Domain Randomization (SDR) is demonstrated on the problem of 2D bounding-box car detection, achieving competitive results on real data after training only on synthetic data, and outperforming both other approaches to generating synthetic data and real data collected in a different domain.
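
The distinction from naive domain randomization is that objects are placed conditioned on a sampled scene context rather than uniformly at random. The toy sketch below first samples a lane offset as context, then places cars near it; the context variable, noise scales, and ranges are illustrative assumptions, not SDR's actual parameterization.

```python
import random

def sdr_sample(rng, n_cars=3, noise=0.2):
    """Sample a context (lane center), then place cars relative to it."""
    lane_center = rng.uniform(-1.0, 1.0)            # global scene context
    cars = [
        {
            "x": rng.uniform(0.0, 100.0),            # along-road position
            "y": lane_center + rng.gauss(0.0, noise) # stays near the lane
        }
        for _ in range(n_cars)
    ]
    return lane_center, cars

center, cars = sdr_sample(random.Random(3))
```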

Human-Centric Indoor Scene Synthesis Using Stochastic Grammar

We present a human-centric method to sample and synthesize 3D room layouts and 2D images thereof, to obtain large-scale 2D/3D image data with perfect per-pixel ground truth. An attributed spatial And-Or graph is proposed to represent indoor scenes.
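
Sampling from a stochastic And-Or grammar expands And-nodes into all of their children and Or-nodes into one child chosen by probability. The two-rule room grammar below is a toy stand-in for the attributed spatial grammar the paper proposes; the symbols and probabilities are illustrative assumptions.

```python
import random

GRAMMAR = {
    "room":      ("and", ["furniture", "walls"]),
    "furniture": ("or",  [("bed", 0.5), ("desk", 0.5)]),
}

def expand(symbol, rng):
    """Recursively expand a symbol into a list of terminal symbols."""
    if symbol not in GRAMMAR:                       # terminal symbol
        return [symbol]
    kind, children = GRAMMAR[symbol]
    if kind == "and":                               # expand every child
        out = []
        for child in children:
            out.extend(expand(child, rng))
        return out
    r, acc = rng.random(), 0.0                      # "or": pick one child
    for child, p in children:
        acc += p
        if r <= acc:
            return expand(child, rng)
    return expand(children[-1][0], rng)

terminals = expand("room", random.Random(4))
```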