Photographic Image Synthesis with Cascaded Refinement Networks

@article{Chen2017PhotographicIS,
  title={Photographic Image Synthesis with Cascaded Refinement Networks},
  author={Qifeng Chen and Vladlen Koltun},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={1520-1529}
}
We present an approach to synthesizing photographic images conditioned on semantic layouts. Given a semantic label map, our approach produces an image with photographic appearance that conforms to the input layout. The approach thus functions as a rendering engine that takes a two-dimensional semantic specification of the scene and produces a corresponding photographic image. Unlike recent and contemporaneous work, our approach does not rely on adversarial training. We show that photographic… 
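The cascaded refinement idea is concrete enough to sketch: a chain of modules, each operating at twice the resolution of the previous one, where every module consumes the semantic layout downsampled to its resolution together with the upsampled features of the preceding module, and the final features are projected linearly to RGB. Below is a minimal PyTorch sketch under that reading of the paper; the module count, feature width, and base resolution are illustrative choices, not the published configuration, and the training loss (the paper uses a perceptual feature-matching loss, not an adversarial one) is omitted.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RefinementModule(nn.Module):
        """One refinement stage: fuse the label map (downsampled to this
        stage's resolution) with upsampled features from the previous
        stage, then apply two 3x3 convolutions with LayerNorm + LeakyReLU."""
        def __init__(self, n_labels, in_ch, out_ch, resolution):
            super().__init__()
            h, w = resolution
            self.resolution = resolution
            self.conv1 = nn.Conv2d(n_labels + in_ch, out_ch, 3, padding=1)
            self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
            self.norm1 = nn.LayerNorm([out_ch, h, w])
            self.norm2 = nn.LayerNorm([out_ch, h, w])

        def forward(self, label_map, prev_feats):
            h, w = self.resolution
            x = F.interpolate(label_map, size=(h, w), mode='nearest')
            if prev_feats is not None:  # the coarsest module has no predecessor
                up = F.interpolate(prev_feats, size=(h, w),
                                   mode='bilinear', align_corners=False)
                x = torch.cat([x, up], dim=1)
            x = F.leaky_relu(self.norm1(self.conv1(x)), 0.2)
            return F.leaky_relu(self.norm2(self.conv2(x)), 0.2)

    class CRN(nn.Module):
        """Cascade of refinement modules from a coarse base resolution up
        to the output resolution, with a final linear projection to RGB."""
        def __init__(self, n_labels, base=(4, 8), n_modules=5, width=64):
            super().__init__()
            mods, in_ch = [], 0
            for i in range(n_modules):
                res = (base[0] * 2 ** i, base[1] * 2 ** i)
                mods.append(RefinementModule(n_labels, in_ch, width, res))
                in_ch = width
            self.stages = nn.ModuleList(mods)
            self.to_rgb = nn.Conv2d(width, 3, 1)

        def forward(self, label_map):
            feats = None
            for stage in self.stages:
                feats = stage(label_map, feats)
            return self.to_rgb(feats)

    # Usage: a dummy tensor standing in for a one-hot semantic layout
    # with 20 classes at 64x128 resolution.
    net = CRN(n_labels=20)
    image = net(torch.randn(1, 20, 64, 128))  # -> (1, 3, 64, 128)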

Citations

On the Diversity of Conditional Image Synthesis With Semantic Layouts
TLDR
A novel approach is presented to synthesize diverse realistic images corresponding to a semantic layout by introducing a diversity loss objective that maximizes the distance between synthesized image pairs and relates the input noise to the semantic segments in the synthesized images.
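The diversity objective just described can be sketched generically: draw two noise codes, generate two images from the same layout, and penalize the generator when the outputs collapse together. The sketch below uses a mode-seeking-style ratio as an illustration of the idea, not the cited paper's exact formulation; generator is a hypothetical noise-conditioned model.

    import torch

    def diversity_loss(img1, img2, z1, z2, eps=1e-6):
        # Ratio of output distance to noise distance: minimizing the
        # negative ratio pushes distinct noise codes to distinct images.
        img_dist = torch.mean(torch.abs(img1 - img2))
        z_dist = torch.mean(torch.abs(z1 - z2)) + eps
        return -img_dist / z_dist

    # Usage with a hypothetical noise-conditioned generator:
    # z1, z2 = torch.randn(1, 64), torch.randn(1, 64)
    # loss = diversity_loss(generator(layout, z1), generator(layout, z2), z1, z2)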
Photographic Image Synthesis with Highway Residual U-net
TLDR
A novel photographic image synthesis method is proposed based on the highway residual U-net (HRU), in which resize-convolution layers replace the deconvolution layers to reduce checkerboard artifacts in the synthesized images.
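Checkerboard artifacts from deconvolution are a known failure mode: the overlapping kernel footprints of a transposed convolution deposit a periodic pattern in the output. The resize-convolution replacement is simple to sketch in PyTorch; the channel counts here are illustrative.

    import torch.nn as nn

    # Transposed convolution: overlapping kernel footprints can leave
    # periodic checkerboard patterns in the upsampled output.
    deconv_up = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)

    # Resize-convolution: upsample first (uniform coverage), then convolve.
    resize_conv_up = nn.Sequential(
        nn.Upsample(scale_factor=2, mode='nearest'),
        nn.Conv2d(64, 32, kernel_size=3, padding=1),
    )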
Enhancing Photorealism Enhancement
TLDR
This work presents an approach to enhancing the realism of synthetic images by a convolutional network that leverages intermediate representations produced by conventional rendering pipelines, trained via a novel adversarial objective, which provides strong supervision at multiple perceptual levels.
On the Diversity of Realistic Image Synthesis
TLDR
A novel approach to synthesizing diverse realistic images corresponding to a semantic layout is presented; it maximizes the distance between synthesized image pairs and links the input noise to the semantic segments in the synthesized images.
Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network
TLDR
This paper introduces accompanying hierarchical-nested adversarial objectives inside the network hierarchies, which regularize mid-level representations and assist generator training to capture the complex image statistics.
Semi-Parametric Image Synthesis
TLDR
A semi-parametric approach to photographic image synthesis from semantic layouts that combines the complementary strengths of parametric and nonparametric techniques is presented.
Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis
TLDR
This work proposes an unsupervised framework named Lab2Pix that adaptively synthesizes images from labels by elegantly considering the particular properties of the label-to-image synthesis task, and designs the generator in a cumulative style that gradually renders synthesized images by fusing features at different levels.
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
  • W. Chen, James Hays
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018
TLDR
This work proposes a novel Generative Adversarial Network approach that synthesizes plausible images from 50 categories, including motorcycles, horses, and couches, and introduces a new network building block, suitable for both the generator and the discriminator, that improves information flow by injecting the input image at multiple scales.
Free View Synthesis
TLDR
This work presents a method for novel view synthesis from input images that are freely distributed around a scene; the method can synthesize images for free camera movement through the scene and works for general scenes with unconstrained geometric layouts.
Semantic View Synthesis
TLDR
This work tackles a new problem, semantic view synthesis: generating free-viewpoint renderings of a synthesized scene from a semantic label map, which is used to impose explicit constraints on the prediction of the multiple-plane image (MPI) representation.
...

References

Showing 1-10 of 54 references
View Synthesis by Appearance Flow
TLDR
This work addresses the problem of novel view synthesis: given an input image, synthesize new images of the same object or scene observed from arbitrary viewpoints. For both objects and scenes, the approach synthesizes novel views of higher perceptual quality than previous CNN-based techniques.
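Appearance flow is mechanically simple: rather than generating pixels from scratch, the network predicts, for each output pixel, a location at which to sample the input image. In PyTorch the warping step maps directly onto grid_sample; the sketch below assumes a separate flow-prediction network (not shown) that outputs normalized sampling coordinates.

    import torch.nn.functional as F

    def warp_by_appearance_flow(src_img, sample_grid):
        """src_img: (N, 3, H, W) input view.
        sample_grid: (N, H, W, 2) per-pixel sampling coordinates in
        [-1, 1], e.g. predicted by a CNN from the input view and the
        desired viewpoint transformation."""
        return F.grid_sample(src_img, sample_grid, mode='bilinear',
                             padding_mode='border', align_corners=False)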
Image Style Transfer Using Convolutional Neural Networks
TLDR
A Neural Algorithm of Artistic Style is introduced that can separate and recombine the image content and style of natural images and provide new insights into the deep image representations learned by Convolutional Neural Networks and demonstrate their potential for high level image synthesis and manipulation.
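The separation of content and style in Gatys et al. rests on one construction: style is represented by the correlations (Gram matrix) of CNN feature maps, while content is represented by the feature maps themselves. A minimal sketch of the Gram computation:

    import torch

    def gram_matrix(feats):
        """feats: (N, C, H, W) activations from one CNN layer; returns
        the (N, C, C) matrix of channel-wise feature correlations."""
        n, c, h, w = feats.shape
        f = feats.reshape(n, c, h * w)
        return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)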
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
TLDR
This work considers image transformation problems and proposes perceptual loss functions for training feed-forward networks on such tasks, showing results on image style transfer, where a feed-forward network is trained to solve, in real time, the optimization problem proposed by Gatys et al.
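A perceptual loss compares images in the feature space of a fixed, pretrained network rather than in pixel space. The sketch below is one common way to set this up with torchvision's VGG-16 (torchvision >= 0.13 weights API); the chosen layer indices and the L1 distance are illustrative choices, not necessarily those of Johnson et al. Inputs are assumed to be ImageNet-normalized (N, 3, H, W) tensors.

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16

    class PerceptualLoss(torch.nn.Module):
        def __init__(self, layer_ids=(3, 8, 15)):  # relu1_2, relu2_2, relu3_3
            super().__init__()
            self.vgg = vgg16(weights='IMAGENET1K_V1').features.eval()
            for p in self.vgg.parameters():
                p.requires_grad = False  # the loss network stays frozen
            self.layer_ids = set(layer_ids)

        def forward(self, pred, target):
            loss, x, y = 0.0, pred, target
            for i, layer in enumerate(self.vgg):
                x, y = layer(x), layer(y)
                if i in self.layer_ids:
                    loss = loss + F.l1_loss(x, y)  # match features, not pixels
                if i >= max(self.layer_ids):
                    break
            return loss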
Image-to-Image Translation with Conditional Adversarial Networks
TLDR
Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
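The conditional adversarial recipe from pix2pix pairs the input label map with the image when feeding the discriminator, and adds an L1 term to the generator objective. A minimal sketch, assuming hypothetical generator G and patch discriminator D; the L1 weight of 100 follows the paper's reported setting.

    import torch
    import torch.nn.functional as F

    def cgan_losses(D, G, label_map, real_img, l1_weight=100.0):
        fake_img = G(label_map)
        # Discriminator: real (layout, image) pairs -> 1, fake pairs -> 0.
        d_real = D(torch.cat([label_map, real_img], dim=1))
        d_fake = D(torch.cat([label_map, fake_img.detach()], dim=1))
        d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
                  + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
        # Generator: fool D on the pair, plus L1 reconstruction.
        d_fake_g = D(torch.cat([label_map, fake_img], dim=1))
        g_loss = (F.binary_cross_entropy_with_logits(d_fake_g, torch.ones_like(d_fake_g))
                  + l1_weight * F.l1_loss(fake_img, real_img))
        return d_loss, g_loss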
Generative Adversarial Text to Image Synthesis
TLDR
A novel deep architecture and GAN formulation is developed to effectively bridge advances in text and image modeling, translating visual concepts from characters to pixels.
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
TLDR
A novel approach that models future frames in a probabilistic manner is proposed, namely a Cross Convolutional Network to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively.
Generative Visual Manipulation on the Natural Image Manifold
TLDR
This paper proposes to learn the natural image manifold directly from data using a generative adversarial neural network, defines a class of image editing operations, and constrains their output to lie on that learned manifold at all times.
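Constraining edits to the learned manifold amounts to a projection: find the latent code whose generated image best matches the edited target. A minimal sketch of that projection step, with G a hypothetical pretrained generator and the step count, learning rate, latent size, and pixel-space distance as illustrative choices:

    import torch

    def project_to_manifold(G, target, z_dim=100, steps=200, lr=0.1):
        """Optimize a latent code so that G(z) approximates the target."""
        z = torch.randn(1, z_dim, requires_grad=True)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = torch.mean((G(z) - target) ** 2)  # pixel-space match
            loss.backward()
            opt.step()
        return z.detach()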
Deep Stereo: Learning to Predict New Views from the World's Imagery
TLDR
This work presents a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets, and is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.
Context Encoders: Feature Learning by Inpainting
TLDR
It is found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures, and can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
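The context-encoder training signal is easy to state: reconstruct the image, but score the reconstruction only where the input was masked out. A minimal sketch of that masked reconstruction term (the paper additionally combines it with an adversarial term):

    import torch

    def masked_recon_loss(pred, target, mask):
        """mask: (N, 1, H, W), 1 where pixels were removed from the input;
        the loss counts only the region the encoder must hallucinate."""
        diff = mask * (pred - target) ** 2
        return diff.sum() / mask.sum().clamp(min=1.0)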
Learning What and Where to Draw
TLDR
This work proposes a new model, the Generative Adversarial What-Where Network (GAWWN), that synthesizes images given instructions describing what content to draw in which location, and shows high-quality 128 x 128 image synthesis on the Caltech-UCSD Birds dataset.
...