StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks

@article{Zhang2017StackGANTT,
  title={StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks},
  author={Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris N. Metaxas},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={5908--5916}
}
Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. [...] Key Method: We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding Stage-I low-resolution images.
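The sketch-refinement decomposition above can be illustrated shape-wise in a toy sketch. This is not the authors' implementation: the 64×64 and 256×256 resolutions match the paper, but every function name, the conditioning-augmentation split, and the naive upsampling stand-in are illustrative assumptions.

```python
import numpy as np

def conditioning_augmentation(text_embedding, rng):
    # Sample latent conditioning variables from a Gaussian whose mean and
    # log-variance are derived from the text embedding (toy split in half).
    d = text_embedding.shape[0] // 2
    mu, log_sigma = text_embedding[:d], text_embedding[d:]
    return mu + np.exp(log_sigma) * rng.standard_normal(d)

def stage1_generator(c_hat, z):
    # Stage-I: sketch primitive shape and colors from text conditioning
    # plus noise, yielding a low-resolution 64x64x3 image (random stand-in).
    seed = int(abs(float(np.sum(c_hat) + np.sum(z))) * 1e6) % (2**32)
    return np.random.default_rng(seed).random((64, 64, 3))

def stage2_generator(low_res, c_hat):
    # Stage-II: condition on the Stage-I result and the text again to
    # correct defects and add detail, yielding a 256x256x3 image.
    up = np.kron(low_res, np.ones((4, 4, 1)))  # naive 4x upsampling stand-in
    return np.clip(up, 0.0, 1.0)

rng = np.random.default_rng(0)
text_embedding = rng.standard_normal(256)      # stand-in for a text encoder output
c_hat = conditioning_augmentation(text_embedding, rng)
low = stage1_generator(c_hat, rng.standard_normal(100))
high = stage2_generator(low, c_hat)
print(low.shape, high.shape)                   # (64, 64, 3) (256, 256, 3)
```

The point of the sketch is only the data flow: the text conditioning feeds both stages, and Stage-II consumes the Stage-I output rather than starting from noise.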
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
TLDR: Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.
Hierarchically-Fused Generative Adversarial Network for Text to Realistic Image Synthesis
TLDR: A novel Hierarchically-fused Generative Adversarial Network (HfGAN) for synthesizing realistic images from text descriptions that is more efficient and noticeably outperforms the previous state-of-the-art methods.
Conditional Image Synthesis Using Stacked Auxiliary Classifier Generative Adversarial Networks
TLDR: Both quantitative and qualitative analyses show that the proposed generative model is capable of generating diverse and realistic images.
Text to Image Synthesis Using Stacked Generative Adversarial Networks
Motivation. Human beings are quickly able to conjure and imagine images related to natural language descriptions. For example, when you read a story about a sunny field full of flowers, an image of a [...]
MRP-GAN: Multi-resolution parallel generative adversarial networks for text-to-image synthesis
TLDR: The Multi-resolution Parallel Generative Adversarial Networks for Text-to-Image Synthesis (MRP-GAN) is proposed to generate photographic images with low-resolution semantics, together with an attention mechanism, named Residual Attention Network, to refine fine-grained details of the generated images.
High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks
TLDR: A novel synthesis framework called Photo-Sketch Synthesis using Multi-Adversarial Networks (PS2-MAN) that iteratively generates low-resolution to high-resolution images in an adversarial way, leveraging pair information via the CycleGAN framework.
Paired-D GAN for Semantic Image Synthesis
TLDR: Experimental results show that Paired-D GAN is capable of semantically synthesizing images to match an input text description while retaining the background of a source image, outperforming state-of-the-art methods.
Text to Image Synthesis With Bidirectional Generative Adversarial Network
TLDR: This paper proposes two semantics-enhanced modules and a novel Textual-Visual Bidirectional Generative Adversarial Network (TVBi-GAN), which improves consistency of synthesized images by incorporating precise semantic features.
FA-GAN: Feature-Aware GAN for Text to Image Synthesis
TLDR: Feature-Aware Generative Adversarial Network (FA-GAN) is proposed to synthesize a high-quality image by integrating two techniques: a self-supervised discriminator and a feature-aware loss.
VITAL: A Visual Interpretation on Text with Adversarial Learning for Image Labeling
In this paper, we propose a novel way to interpret text information by extracting visual feature presentation from multiple high-resolution and photo-realistic synthetic images generated by [...]

References

Showing 1-10 of 49 references
Generative Adversarial Text to Image Synthesis
TLDR: A novel deep architecture and GAN formulation are developed to effectively bridge advances in text and image modeling, translating visual concepts from characters to pixels.
Generative Image Modeling Using Style and Structure Adversarial Networks
TLDR: This paper factorizes the image generation process and proposes the Style and Structure Generative Adversarial Network, a model that is interpretable, generates more realistic images, and can be used to learn unsupervised RGBD representations.
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
  • C. Ledig, Lucas Theis, +6 authors W. Shi
  • Computer Science, Mathematics
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
TLDR: SRGAN, a generative adversarial network (GAN) for image super-resolution (SR), is presented; to the authors' knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors, using a perceptual loss function that consists of an adversarial loss and a content loss.
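The perceptual loss described in the SRGAN summary above combines a content term with a weighted adversarial term. A minimal sketch of that combination, with plain arrays standing in for VGG feature maps and an assumed weight of 1e-3 (the names and toy inputs are illustrative, not the paper's code):

```python
import numpy as np

def content_loss(features_sr, features_hr):
    # MSE between feature maps of a pretrained network (VGG in the paper);
    # here plain arrays stand in for those feature maps.
    return float(np.mean((features_sr - features_hr) ** 2))

def adversarial_loss(d_sr):
    # Generator's adversarial term: mean of -log D(G(LR)) over the batch.
    return float(np.mean(-np.log(d_sr + 1e-12)))

def perceptual_loss(features_sr, features_hr, d_sr, weight=1e-3):
    # SRGAN-style combination: content loss plus a small adversarial weight.
    return content_loss(features_sr, features_hr) + weight * adversarial_loss(d_sr)

rng = np.random.default_rng(0)
f_sr, f_hr = rng.random((8, 16)), rng.random((8, 16))
d_sr = rng.uniform(0.1, 0.9, 8)   # discriminator outputs in (0, 1)
print(perceptual_loss(f_sr, f_hr, d_sr))
```

The small adversarial weight keeps the content term dominant while still pushing generated images toward the natural-image manifold.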
Learning What and Where to Draw
TLDR: This work proposes a new model, the Generative Adversarial What-Where Network (GAWWN), that synthesizes images given instructions describing what content to draw in which location, and shows high-quality 128 x 128 image synthesis on the Caltech-UCSD Birds dataset.
Image-to-Image Translation with Conditional Adversarial Networks
TLDR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems, and this approach is demonstrated to be effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Conditional Image Synthesis with Auxiliary Classifier GANs
TLDR: A variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence is constructed, and it is demonstrated that high-resolution samples provide class information not present in low-resolution samples.
Neural Photo Editing with Introspective Adversarial Networks
TLDR: The Neural Photo Editor is presented, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images, and the Introspective Adversarial Network is introduced, a novel hybridization of the VAE and GAN.
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
TLDR: A generative parametric model capable of producing high-quality samples of natural images, using a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
Generating Interpretable Images with Controllable Structure
TLDR: Improved text-to-image synthesis with controllable object locations using an extension of Pixel Convolutional Neural Networks (PixelCNN), and it is shown how the model can generate images conditioned on part keypoints and segmentation masks.
Improved Techniques for Training GANs
TLDR: This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic; it presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.