Corpus ID: 202788797

Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

@article{Liu2019LearningTP,
  title={Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis},
  author={Xihui Liu and Guojun Yin and Jing Shao and Xiaogang Wang and Hongsheng Li},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.06809}
}
Semantic image synthesis aims at generating photorealistic images from semantic layouts. Previous approaches based on conditional generative adversarial networks (GANs) achieve state-of-the-art performance on this task, either by feeding the semantic label maps as inputs to the generator or by using them to modulate the activations in normalization layers via affine transformations. We argue that convolutional kernels in the generator should be aware of the distinct semantic labels at different locations… 
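
The abstract's core claim is technical: the generator's convolution kernels should vary with the semantic label at each spatial location instead of being shared across the whole image. Below is a minimal PyTorch sketch of one way to realize layout-predicted convolutions; the module name, layer sizes, and the per-pixel depthwise formulation are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayoutPredictedConv(nn.Module):
    """Predict per-location depthwise conv kernels from a semantic layout
    and apply them to generator features (illustrative sketch, not the
    paper's exact design)."""
    def __init__(self, num_classes: int, channels: int, ksize: int = 3):
        super().__init__()
        self.channels, self.ksize = channels, ksize
        # Small network mapping the one-hot label map to k*k weights
        # per channel at every spatial position.
        self.weight_net = nn.Sequential(
            nn.Conv2d(num_classes, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, channels * ksize * ksize, 3, padding=1),
        )

    def forward(self, feat: torch.Tensor, layout: torch.Tensor) -> torch.Tensor:
        # feat:   (B, C, H, W) generator features
        # layout: (B, num_classes, h, w) one-hot semantic label map
        B, C, H, W = feat.shape
        k = self.ksize
        layout = F.interpolate(layout, size=(H, W), mode="nearest")
        kernels = self.weight_net(layout).view(B, C, k * k, H, W)
        # Gather each pixel's k*k neighborhood, then apply the predicted
        # per-pixel depthwise kernel: weight the neighbors and sum.
        patches = F.unfold(feat, k, padding=k // 2).view(B, C, k * k, H, W)
        return (kernels * patches).sum(dim=2)
```

The published method (known as CC-FPSE) also pairs the predicted conditional convolutions with a feature-pyramid semantics-embedding discriminator; the sketch above covers only the kernel-prediction idea.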

Citations

Text to Image Generation with Semantic-Spatial Aware GAN
TLDR
A novel Semantic-Spatial Aware Convolution Network is introduced, which learns a semantic-adaptive transformation conditioned on text to effectively fuse text and image features, and learns a mask map in a weakly supervised way, dependent on the current text-image fusion process, to spatially guide the transformation.
Dual Attention GANs for Semantic Image Synthesis
TLDR
A novel Dual Attention GAN (DAGAN) is proposed to synthesize photo-realistic and semantically-consistent images with fine details from the input layouts without imposing extra training overhead or modifying the network architectures of existing methods.
Semantic Image Synthesis
Despite their recent successes, GAN models for semantic image synthesis still suffer from poor image quality when trained with only adversarial supervision. Historically, additionally employing the…
Layout-to-Image Translation With Double Pooling Generative Adversarial Networks
  • Hao Tang, N. Sebe
  • Computer Science, Medicine
    IEEE Transactions on Image Processing
  • 2021
TLDR
This paper proposes a novel Double Pooling GAN (DPGAN) for generating photo-realistic and semantically consistent results from the input layout. DPGAN consists of a Square-shape Pooling Module (SPM) and a Rectangle-shape Pooling Module (RPM), which together capture long-range semantic dependencies along both horizontal and vertical directions.
Semantic Image Analogy with a Conditional Single-Image GAN
TLDR
This work proposes a novel method to model the patch-level correspondence between semantic layout and appearance of a single image by training a single-image GAN that takes semantic labels as conditional input.
InjectionGAN: Unified Generative Adversarial Networks for Arbitrary Image Attribute Editing
TLDR
A new model for arbitrary attribute transfer is proposed, based on a novel generative adversarial network (GAN) with an auto-encoder-like architecture and multiple linear transformation and refinement connections, which helps preserve structural information while slightly modifying the appearance at the pixel level through adversarial training.
Semantic Palette: Guiding Scene Generation with Class Proportions
TLDR
This work introduces a conditional framework with novel architecture designs and learning objectives that effectively accommodates class proportions to guide scene generation, and can produce layouts close to the real distribution, enhancing the whole scene-generation process.
Semantic Image Synthesis Manipulation for Stability Problem using Generative Adversarial Networks: A Survey
TLDR
This survey discusses the Generative Adversarial Network (GAN) model, chosen for its ability to synthesize good samples directly, and reviews the different methods used to improve GAN results, with the aim of producing better outputs and generating more samples.
USIS: Unsupervised Semantic Image Synthesis
TLDR
This work proposes a new unsupervised paradigm for semantic image synthesis (USIS), deploying a SPADE generator that learns to output images with visually separable semantic classes via a self-supervised segmentation loss, and proposes whole-image wavelet-based discrimination.
Context-Aware Layout to Image Generation with Enhanced Object Appearance
  • Sen He, Wentong Liao, +4 authors T. Xiang
  • Computer Science
    2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
TLDR
A context-aware feature transformation module is introduced in the generator to ensure that the generated feature encoding of either object or stuff is aware of other coexisting objects/stuff in the scene.

References

Showing 1-10 of 35 references
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
TLDR
A new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented, which significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks
TLDR
This paper proposes Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions and introduces a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold.
Self-Attention Generative Adversarial Networks
TLDR
The proposed SAGAN achieves state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Fréchet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset.
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
TLDR
An Attentional Generative Adversarial Network that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation and for the first time shows that the layered attentional GAN is able to automatically select the condition at the word level for generating different parts of the image.
Photographic Image Synthesis with Cascaded Refinement Networks
  • Qifeng Chen, V. Koltun
  • Computer Science
    2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
TLDR
It is shown that photographic images can be synthesized from semantic layouts by a single feedforward network with appropriate structure, trained end-to-end with a direct regression objective.
Image-to-Image Translation with Conditional Adversarial Networks
TLDR
Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
High-Fidelity Image Generation With Fewer Labels
TLDR
This work demonstrates how one can benefit from recent work on self- and semi-supervised learning to outperform the state of the art on unsupervised ImageNet synthesis as well as in the conditional setting.
Semantic Image Synthesis With Spatially-Adaptive Normalization
TLDR
Spatially-adaptive normalization is proposed: a simple but effective layer for synthesizing photorealistic images given an input semantic layout, which allows users to easily control the style and content of image synthesis results as well as create multi-modal results.
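
For contrast with the abstract above, which argues for going beyond normalization-layer modulation, here is a short sketch of that baseline technique: per-pixel scale and shift predicted from the label map, applied after a parameter-free normalization. It follows the published SPADE design at a high level, with illustrative layer sizes.

```python
import torch.nn as nn
import torch.nn.functional as F

class SPADELikeNorm(nn.Module):
    """Spatially-adaptive normalization sketch: normalize features, then
    modulate with per-pixel scale/shift predicted from the layout."""
    def __init__(self, num_classes: int, channels: int, hidden: int = 128):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)  # no learned affine
        self.shared = nn.Sequential(
            nn.Conv2d(num_classes, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, feat, layout):
        # Resize the one-hot label map to the feature resolution.
        layout = F.interpolate(layout, size=feat.shape[2:], mode="nearest")
        h = self.shared(layout)
        return self.norm(feat) * (1 + self.gamma(h)) + self.beta(h)
```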
Large Scale GAN Training for High Fidelity Natural Image Synthesis
TLDR
It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input.
A Style-Based Generator Architecture for Generative Adversarial Networks
  • Tero Karras, S. Laine, Timo Aila
  • Computer Science, Mathematics
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
An alternative generator architecture for generative adversarial networks is proposed, borrowing from style transfer literature, that improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation.