Corpus ID: 219177225

OPAL-Net: A Generative Model for Part-based Object Layout Generation

@article{Baghel2020OPALNetAG,
  title={OPAL-Net: A Generative Model for Part-based Object Layout Generation},
  author={Rishabh Baghel and Ravi Kiran Sarvadevabhatla},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.00190}
}
We propose OPAL-Net, a novel hierarchical architecture for part-based layout generation of objects from multiple categories using a single unified model. We adopt a coarse-to-fine strategy involving semantically conditioned autoregressive generation of bounding-box layouts and pixel-level part layouts for objects. We use Graph Convolutional Networks and Deep Recurrent Networks, along with custom-designed Conditional Variational Autoencoders, to enable flexible, diverse, and category-aware generation…
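
The abstract describes a coarse-to-fine pipeline built from Graph Convolutional Networks, Deep Recurrent Networks, and Conditional Variational Autoencoders. To make the conditional-VAE ingredient concrete, below is a minimal PyTorch sketch of a CVAE that encodes and regenerates a single part bounding box (x, y, w, h) conditioned on an object-category label. All names and dimensions (ConditionalBoxVAE, hidden_dim, and so on) are illustrative assumptions, not the paper's implementation; OPAL-Net additionally couples such decoders with graph convolutions and autoregressive recurrence over parts, which this sketch omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalBoxVAE(nn.Module):
    """Toy conditional VAE over one part bounding box (x, y, w, h),
    conditioned on an object-category label (hypothetical sketch)."""

    def __init__(self, num_categories=10, box_dim=4, hidden_dim=64, latent_dim=16):
        super().__init__()
        self.cond = nn.Embedding(num_categories, hidden_dim)
        self.enc = nn.Sequential(nn.Linear(box_dim + hidden_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, box_dim),
            nn.Sigmoid(),  # box coordinates normalised to [0, 1]
        )

    def forward(self, box, category):
        c = self.cond(category)                                   # (B, hidden_dim)
        h = self.enc(torch.cat([box, c], dim=-1))                 # (B, hidden_dim)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.dec(torch.cat([z, c], dim=-1)), mu, logvar

def cvae_loss(recon, box, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = F.mse_loss(recon, box, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Toy usage: a batch of 8 boxes, all from (hypothetical) category 3.
model = ConditionalBoxVAE()
box = torch.rand(8, 4)
category = torch.full((8,), 3, dtype=torch.long)
recon, mu, logvar = model(box, category)
print(cvae_loss(recon, box, mu, logvar).item())
```

At test time one would drop the encoder and decode z ~ N(0, I) together with the category embedding, which is the standard way a CVAE produces diverse, category-conditioned samples.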
