AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style

@article{Frolov2021AttrLostGANAC,
  title={AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style},
  author={Stanislav Frolov and Avneesh Sharma and J{\"o}rn Hees and Tushar Karayil and Federico Raue and Andreas R. Dengel},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.13722}
}
Conditional image synthesis from layout has recently attracted much interest. Previous approaches condition the generator on object locations and class labels but lack fine-grained control over the diverse appearance aspects of individual objects. Gaining control over the image generation process is fundamental to building practical applications with a user-friendly interface. In this paper, we propose a method for attribute-controlled image synthesis from layout, which allows specifying the…
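To make the conditioning interface the abstract describes concrete, here is a minimal, hypothetical sketch of a generator that takes per-object bounding boxes, class labels, attribute vectors, and style codes. The architecture and all names (AttrLayoutGenerator, obj_to_feat, and so on) are our own illustration, not the paper's actual model.

```python
# A minimal sketch (not the paper's architecture) of the conditioning
# interface: each object is specified by a bounding box, a class label,
# and an attribute vector, and the generator maps the full layout plus
# per-object style codes to an image. All names here are hypothetical.
import torch
import torch.nn as nn

class AttrLayoutGenerator(nn.Module):
    def __init__(self, num_classes=10, num_attrs=16, style_dim=64, img_size=64):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, 32)
        self.attr_proj = nn.Linear(num_attrs, 32)
        # Per-object conditioning: box (4) + class emb (32) + attrs (32) + style.
        obj_dim = 4 + 32 + 32 + style_dim
        self.obj_to_feat = nn.Linear(obj_dim, 128)
        # Toy decoder from pooled object features to an RGB image.
        self.decode = nn.Sequential(
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, 3 * img_size * img_size), nn.Tanh(),
        )
        self.img_size = img_size

    def forward(self, boxes, labels, attrs, styles):
        # boxes: (B, O, 4), labels: (B, O), attrs: (B, O, num_attrs),
        # styles: (B, O, style_dim); O = objects per layout.
        obj = torch.cat(
            [boxes, self.class_emb(labels), self.attr_proj(attrs), styles], dim=-1
        )
        feat = self.obj_to_feat(obj).mean(dim=1)  # pool over objects
        img = self.decode(feat)
        return img.view(-1, 3, self.img_size, self.img_size)

g = AttrLayoutGenerator()
boxes = torch.rand(2, 3, 4)              # normalized (x, y, w, h) per object
labels = torch.randint(0, 10, (2, 3))
attrs = torch.rand(2, 3, 16)             # per-object attribute vectors
styles = torch.randn(2, 3, 64)           # reconfigurable per-object styles
print(g(boxes, labels, attrs, styles).shape)  # torch.Size([2, 3, 64, 64])
```

The point of the interface is that each conditioning input can be edited independently: changing one object's attribute vector or style code should change only that object's appearance, while the layout stays fixed.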
DT2I: Dense Text-to-Image Generation from Region Descriptions
TLDR
Dense text-to-image (DT2I) synthesis is introduced as a new task to pave the way toward more intuitive image generation, together with DTC-GAN, a novel method to generate images from semantically rich region descriptions, and a multi-modal region feature matching loss that encourages semantic image-text matching.
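A region feature matching loss of the kind mentioned above could, for instance, take a contrastive form: pull each region's image features toward the features of its own description and away from the other regions'. The sketch below is our illustration of such a loss, not DTC-GAN's actual objective; region_matching_loss and temperature are hypothetical names.

```python
# A generic contrastive region-text matching loss: similarity logits over
# all region/description pairs, with the diagonal as the correct matches.
import torch
import torch.nn.functional as F

def region_matching_loss(region_feats, text_feats, temperature=0.1):
    """region_feats, text_feats: (R, D) L2-normalized features, row-aligned."""
    logits = region_feats @ text_feats.t() / temperature   # (R, R) similarities
    targets = torch.arange(region_feats.size(0))           # diagonal = matches
    return F.cross_entropy(logits, targets)

loss = region_matching_loss(F.normalize(torch.randn(4, 32), dim=-1),
                            F.normalize(torch.randn(4, 32), dim=-1))
print(loss.item())
```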
Modeling Image Composition for Complex Scene Generation
We present a method that achieves state-of-the-art results on challenging (few-shot) layout-to-image generation tasks by accurately modeling textures, structures, and relationships contained in a complex scene.
Layered Controllable Video Generation
TLDR
This work introduces layered controllable video generation, where the initial frame of a video is decomposed into foreground and background layers, with which the user can control the video generation process by simply manipulating the foreground mask.
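The layer decomposition boils down to alpha compositing: the frame is foreground where the mask is on and background elsewhere, so moving the mask moves the object. A minimal sketch with toy tensors; composite is our hypothetical helper, not the paper's code.

```python
# Alpha compositing of foreground/background layers under a user-edited mask.
import torch

def composite(fg, bg, mask):
    """fg, bg: (C, H, W) layers; mask: (1, H, W) in [0, 1]."""
    return mask * fg + (1.0 - mask) * bg

fg, bg = torch.rand(3, 8, 8), torch.rand(3, 8, 8)
mask = torch.zeros(1, 8, 8)
mask[:, 2:6, 2:6] = 1.0          # user places the foreground here
frame = composite(fg, bg, mask)  # shifting the mask shifts the object
print(frame.shape)               # torch.Size([3, 8, 8])
```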
Combining Transformer Generators with Convolutional Discriminators
TLDR
This paper studies the combination of a transformer-based generator and a convolutional discriminator, successfully removing the need for the aforementioned design choices, and investigates the frequency spectrum properties of generated images, observing that the model retains the benefits of an attention-based generator.

References

Attribute-Guided Image Generation from Layout
TLDR
A new image generation method that enables instance-level attribute control; the generated images have higher resolution, object classification accuracy, and consistency than the previous state of the art.
Image Generation From Layout
TLDR
The proposed Layout2Im model significantly outperforms the previous state of the art, boosting the best reported inception score by 24.66% and 28.57% on the very challenging COCO-Stuff and Visual Genome datasets, respectively.
Controlling Style and Semantics in Weakly-Supervised Image Generation
TLDR
A weakly-supervised approach for conditional image generation of complex scenes in which a user has fine control over the objects appearing in the scene; the model's ability to manipulate scenes is showcased on complex datasets such as COCO and Visual Genome.
Image Synthesis From Reconfigurable Layout and Style
  • Wei Sun, Tianfu Wu
  • Computer Science
    2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2019
TLDR
This paper presents a layout- and style-based architecture for generative adversarial networks (termed LostGANs) that can be trained end-to-end to generate images from reconfigurable layout and style.
Learning Layout and Style Reconfigurable GANs for Controllable Image Synthesis
  • Wei Sun, Tianfu Wu
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
TLDR
An intuitive paradigm for the task, layout-to-mask-to-image, which learns to unfold object masks in a weakly-supervised way from an input layout and object style codes, is proposed, and a method built on Generative Adversarial Networks (GANs) is presented.
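As a rough illustration of the layout-to-mask step, the sketch below rasterizes normalized boxes into per-object box masks; in the actual method a learned predictor unfolds these into object shapes. boxes_to_masks is our hypothetical helper, not LostGAN code.

```python
# Rasterize a layout's boxes into soft per-object masks that downstream
# layers could consume. Illustrative simplification only.
import torch

def boxes_to_masks(boxes, size=64):
    """boxes: (O, 4) normalized (x, y, w, h); returns (O, size, size) masks.

    A learned mask predictor would refine these box masks into shapes.
    """
    ys = torch.linspace(0, 1, size).view(size, 1)
    xs = torch.linspace(0, 1, size).view(1, size)
    masks = []
    for x, y, w, h in boxes:
        inside = (xs >= x) & (xs < x + w) & (ys >= y) & (ys < y + h)
        masks.append(inside.float())
    return torch.stack(masks)

masks = boxes_to_masks(torch.tensor([[0.1, 0.1, 0.4, 0.5],
                                     [0.5, 0.3, 0.3, 0.3]]))
print(masks.shape, masks.sum(dim=(1, 2)))  # per-object mask areas
```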
Object-Centric Image Generation from Layouts
TLDR
This work starts from the idea that a model must be able to understand individual objects and the relationships between them in order to generate complex scenes well, and introduces an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited for multi-object images.
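The object-centric adaptation keeps the standard Fréchet distance but applies it to features of per-object crops rather than of whole images. A sketch with feature extraction stubbed out; frechet_distance and crop_objects are illustrative names, and only the Fréchet formula itself is the standard one.

```python
# Frechet distance between two feature sets, plus a per-object cropper.
# In the object-centric variant, the distance is computed on Inception
# features of the object crops instead of the full images.
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_fake):
    """Standard Frechet (FID) formula on two (N, D) feature arrays."""
    mu_r, mu_f = feats_real.mean(0), feats_fake.mean(0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f).real  # matrix square root
    return float(((mu_r - mu_f) ** 2).sum() + np.trace(cov_r + cov_f - 2 * covmean))

def crop_objects(image, boxes):
    """Cut out each (x0, y0, x1, y1) pixel box from an (H, W, C) image."""
    return [image[y0:y1, x0:x1] for x0, y0, x1, y1 in boxes]

rng = np.random.default_rng(0)
print(frechet_distance(rng.normal(size=(200, 8)), rng.normal(size=(200, 8))))
```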
PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph
TLDR
This work proposes PasteGAN, a semi-parametric method for generating an image from a scene graph and image crops, where the spatial arrangement of the objects and their pairwise relationships are defined by the scene graph and the object appearances are determined by the given crops.
Generating Multiple Objects at Spatially Distinct Locations
TLDR
This work introduces a new approach that allows placing arbitrarily many objects within an image by adding an object pathway to both the generator and the discriminator, and shows that, through the object pathway, the approach can control object locations and model complex scenes with multiple objects at various positions.
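A toy rendering of the object pathway idea: per-object features are written into a global feature map at each object's box location before the usual convolutional pathway continues. Shapes, names, and the additive placement below are our simplification, not the paper's implementation.

```python
# Write per-object features into a global feature map at their box locations.
import torch

def apply_object_pathway(global_feat, obj_feats, boxes):
    """global_feat: (C, H, W); obj_feats: (O, C); boxes: (O, 4) normalized."""
    C, H, W = global_feat.shape
    out = global_feat.clone()
    for feat, (x, y, w, h) in zip(obj_feats, boxes):
        x0, y0 = int(x * W), int(y * H)
        x1, y1 = max(x0 + 1, int((x + w) * W)), max(y0 + 1, int((y + h) * H))
        out[:, y0:y1, x0:x1] += feat.view(C, 1, 1)  # broadcast into the box
    return out

feat = apply_object_pathway(torch.zeros(8, 16, 16),
                            torch.randn(2, 8),
                            torch.tensor([[0.0, 0.0, 0.5, 0.5],
                                          [0.5, 0.5, 0.5, 0.5]]))
print(feat.shape)  # torch.Size([8, 16, 16])
```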
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
TLDR
A new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented, which significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
TLDR
This work proposes a novel hierarchical approach for text-to-image synthesis by inferring semantic layout, and shows that the model can substantially improve image quality, interpretability of the output, and semantic alignment to the input text over existing approaches.
...