Perceptual Losses for Real-Time Style Transfer and Super-Resolution

Justin Johnson, Alexandre Alahi, Li Fei-Fei
We consider image transformation problems, where an input image is transformed into an output image. We combine the benefits of both approaches and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve, in real time, the optimization problem proposed by Gatys et al. Compared to the optimization-based method, our network gives similar qualitative…
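The core idea, comparing images in the feature space of a fixed network rather than in pixel space, can be sketched in a few lines. The toy "feature extractor" below (plain affine maps with ReLU) is only an illustrative stand-in for the pretrained VGG-16 the paper actually uses; the structure of the loss is what matters.

```python
import numpy as np

def extract_features(img, weights):
    # Stand-in feature extractor: a stack of affine maps with ReLU.
    # In the paper, activations come from a fixed, pretrained VGG-16;
    # here, handcrafted weight matrices merely illustrate the mechanics.
    feats = []
    x = img.reshape(-1)
    for W in weights:
        x = np.maximum(W @ x, 0.0)  # linear map followed by ReLU
        feats.append(x)
    return feats

def perceptual_loss(output_img, target_img, weights):
    # Feature reconstruction loss: squared L2 distance between the
    # feature activations of output and target, averaged per layer.
    loss = 0.0
    for f_out, f_tgt in zip(extract_features(output_img, weights),
                            extract_features(target_img, weights)):
        loss += np.mean((f_out - f_tgt) ** 2)
    return loss
```

Because the loss is computed on features rather than pixels, two images can differ pixel-wise yet incur a small loss if they are perceptually similar, which is precisely what makes the loss suitable for style transfer and super-resolution.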


Learned perceptual image enhancement
This paper shows that adding a learned no-reference image quality metric to the loss can significantly improve enhancement operators and can be effective for tuning a variety of operators such as local tone mapping and dehazing.
Single Image Super-Resolution via Perceptual Loss Guided by Denoising Auto-Encoder
A new perceptual loss extracted from a pre-trained denoising auto-encoder with symmetric skip connections (SDAE) is designed, which has both better visual quality and higher PSNR and SSIM than the state-of-the-art methods.
Learning Linear Transformations for Fast Image and Video Style Transfer
This work presents an approach for universal style transfer that learns the transformation matrix in a data-driven fashion that is efficient yet flexible to transfer different levels of styles with the same auto-encoder network.
A novel perceptual loss function for single image super-resolution
A new perceptual loss function is proposed that combines features from multiple levels, incorporating the discrepancy between the reconstruction and the ground truth at different structural scales; it can drive the same network to produce better results whether used alone or combined with other loss functions.
SROBB: Targeted Perceptual Loss for Single Image Super-Resolution
A decoder network is optimized with a targeted objective function that penalizes images at different semantic levels using corresponding terms; this results in more realistic textures and sharper edges and outperforms other state-of-the-art algorithms.
Learning image block statistics and quality assessment losses for perceptual image super-resolution
Experiments prove that the proposed loss function can effectively guide the network to generate images of high-perceptual quality while considering the structural distortion for single-image super-resolution.
EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis
This work proposes a novel application of automated texture synthesis combined with a perceptual loss that focuses on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground-truth images during training, achieving a significant boost in image quality at high magnification ratios.
Mask-Guided Style Transfer Network for Purifying Real Images
This paper introduces segmentation masks to construct RGB-mask pairs as inputs, designs a mask-guided style transfer network that learns style features separately from the attention and background regions and content features from the full and attention regions, and proposes a novel region-level task-guided loss to constrain the features learned from style and content.
Training a Task-Specific Image Reconstruction Loss
It is shown that a single natural image and corresponding distortions are sufficient to train a feature extractor that outperforms state-of-the-art loss functions in applications like single image super resolution, denoising, and JPEG artifact removal.
Learning Intrinsic Image Decomposition by Deep Neural Network with Perceptual Loss
Experimental results show that the model trained on a synthetic single-object dataset produces good decomposition results not only on synthetic images but also on real-world scene-level images containing multiple objects.


Image Style Transfer Using Convolutional Neural Networks
A Neural Algorithm of Artistic Style is introduced that can separate and recombine the image content and style of natural images and provide new insights into the deep image representations learned by Convolutional Neural Networks and demonstrate their potential for high level image synthesis and manipulation.
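Gatys et al. represent style through correlations between feature channels, i.e. Gram matrices of the feature maps. A minimal numpy sketch, assuming a feature map of shape (C, H, W); the normalization by the number of spatial positions is one common convention and varies across papers:

```python
import numpy as np

def gram_matrix(features):
    # Style representation: correlations between feature channels.
    # `features` has shape (C, H, W); the result is the C x C Gram
    # matrix, normalized by the number of spatial positions.
    C, H, W = features.shape
    F = features.reshape(C, H * W)   # flatten the spatial dimensions
    return (F @ F.T) / (H * W)

def style_loss(feats_a, feats_b):
    # Squared Frobenius distance between the two Gram matrices.
    G_a, G_b = gram_matrix(feats_a), gram_matrix(feats_b)
    return np.sum((G_a - G_b) ** 2)
```

Because the Gram matrix sums over all spatial positions, it discards spatial arrangement entirely, which is why it captures texture and style rather than image content.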
Image Super-Resolution Using Deep Convolutional Networks
We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low- and high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one.
Single-Image Super-Resolution Using Sparse Regression and Natural Image Prior
  • K. Kim, Younghee Kwon
  • IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010
Compared with existing algorithms, kernel ridge regression (KRR) leads to better generalization than simply storing the examples, as is done in existing example-based algorithms, and results in much less noisy images.
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images
This work proposes an alternative approach that moves the computational burden to a learning stage and trains compact feed-forward convolutional networks to generate multiple samples of the same texture of arbitrary size and to transfer artistic style from a given image to any other image.
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
Markovian Generative Adversarial Networks (MGANs) are proposed, a method for training generative networks for efficient texture synthesis that surpasses previous neural texture synthesizers by a significant margin and applies to texture synthesis, style transfer, and video stylization.
Recurrent Convolutional Neural Networks for Scene Labeling
This work proposes an approach consisting of a recurrent convolutional neural network that allows considering a large input context while limiting the capacity of the model, and yields state-of-the-art performance on both the Stanford Background Dataset and the SIFT Flow Dataset while remaining very fast at test time.
Fast Image Super-Resolution Based on In-Place Example Regression
We propose a fast regression model for practical single image super-resolution based on in-place examples, by leveraging two fundamental super-resolution approaches: learning from an external database and learning from self-examples.
Fast and accurate image upscaling with super-resolution forests
This paper shows the close relation of previous work on single-image super-resolution to locally linear regression, demonstrates how random forests fit naturally into this framework, and proposes to map directly from low- to high-resolution patches using random forests.
Colorful Image Colorization
This paper proposes a fully automatic approach to colorization that produces vibrant and realistic colorizations and shows that colorization can be a powerful pretext task for self-supervised feature learning, acting as a cross-channel encoder.
Learning a Deep Convolutional Network for Image Super-Resolution
This work proposes a deep learning method for single image super-resolution (SR) that directly learns an end-to-end mapping between the low/high-resolution images and shows that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network.