Multiple Style Transfer via Variational AutoEncoder

Zhi-Song Liu, Vicky S. Kalogeiton, Marie-Paule Cani
Modern works on style transfer focus on transferring style from a single image. Recently, some approaches have studied multiple style transfer; these, however, are either too slow or fail to mix multiple styles. We propose ST-VAE, a Variational AutoEncoder for latent-space-based style transfer. It performs multiple style transfer by projecting nonlinear styles onto a linear latent space, enabling styles to be merged via linear interpolation before the new style is transferred to the content image. To evaluate…
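The abstract's core operation, merging styles by linear interpolation in a latent space, can be sketched minimally. This is a hypothetical illustration, not the paper's implementation: `mix_styles` is an invented helper, and the VAE encoder/decoder that produce and consume the style codes are omitted.

```python
import numpy as np

def mix_styles(style_codes, weights):
    # Convex combination of style latent codes: sum_i w_i * z_i,
    # with the weights normalized to sum to 1. Because the VAE maps
    # nonlinear styles onto a linear latent space, this simple
    # interpolation corresponds to a meaningful style mixture.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    z = np.stack(style_codes)          # (num_styles, latent_dim)
    return (w[:, None] * z).sum(axis=0)
```

With equal weights this yields the midpoint of two style codes; with weights `[1, 0]` it reduces to the first style unchanged.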


The Swiss Army Knife for Image-to-Image Translation: Multi-Task Diffusion Models
This work builds on a method for image-to-image translation using denoising diffusion implicit models and adds a regression task and a segmentation task to guide image generation toward the desired output.
Name Your Style: An Arbitrary Artist-aware Image Style Transfer
This paper introduces a contrastive training strategy to effectively extract style descriptions from the image-text model (i.e., CLIP), which aligns stylization with the text description, and proposes a novel and efficient attention module that explores cross-attentions to fuse style and content features.


Multi-style Generative Network for Real-time Transfer
MSG-Net is the first to achieve real-time brush-size control in a purely feed-forward manner for style transfer and is compatible with most existing techniques including content-style interpolation, color-preserving, spatial control and brush stroke size control.
Universal Style Transfer via Feature Transforms
The key ingredient of the method is a pair of feature transforms, whitening and coloring, embedded in an image reconstruction network; together they perform a direct matching of the feature covariance of the content image to that of a given style image.
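The whitening-coloring idea described above can be sketched on flattened feature maps of shape (channels, pixels). This is a simplified NumPy illustration of the statistical matching only, not the paper's full network, and `whiten_color` is a hypothetical name:

```python
import numpy as np

def whiten_color(content_feat, style_feat):
    # Whitening-coloring transform: make the content features
    # uncorrelated (identity covariance), then impose the style
    # features' covariance and mean. Inputs are (C, N) arrays.
    def center(f):
        mu = f.mean(axis=1, keepdims=True)
        return f - mu, mu

    fc, _ = center(content_feat)
    fs, mu_s = center(style_feat)

    # Whitening: eigendecompose the content covariance and rescale.
    Ec, Dc, _ = np.linalg.svd(fc @ fc.T / (fc.shape[1] - 1))
    whitened = Ec @ np.diag(Dc ** -0.5) @ Ec.T @ fc

    # Coloring: impose the style covariance, then restore the style mean.
    Es, Ds, _ = np.linalg.svd(fs @ fs.T / (fs.shape[1] - 1))
    colored = Es @ np.diag(Ds ** 0.5) @ Es.T @ whitened
    return colored + mu_s
```

After the transform, the output's covariance matrix equals that of the style features, which is exactly the "direct matching of feature covariance" the summary refers to.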
Conditional Fast Style Transfer Network
This paper extends a fast neural style transfer network so that it can learn multiple styles at the same time, and shows that the proposed network can mix multiple styles even though it is trained on each training style independently.
Deep Photo Style Transfer
This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style, and constrains the transformation from input to output to be locally affine in color space.
Learning Linear Transformations for Fast Arbitrary Style Transfer
This work derives the form of the transformation matrix theoretically and presents an arbitrary style transfer approach that learns the transformation matrix with a feed-forward network; it is highly efficient yet allows a flexible combination of multi-level styles while preserving content affinity during the style transfer process.
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization
This paper presents a simple yet effective approach that for the first time enables arbitrary style transfer in real-time, comparable to the fastest existing approach, without the restriction to a pre-defined set of styles.
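The adaptive instance normalization (AdaIN) operation behind this approach aligns the per-channel mean and standard deviation of the content features to those of the style features. A minimal NumPy sketch, assuming (C, H, W) feature maps rather than the original batched PyTorch tensors:

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    # Adaptive Instance Normalization: normalize each content channel
    # to zero mean / unit std, then rescale and shift it with the
    # corresponding style channel's statistics.
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True) + eps
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean
```

Because the operation is a closed-form statistic swap with no learned per-style parameters, it works for arbitrary styles at test time, which is what lifts the restriction to a pre-defined style set.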
Analyzing and Improving the Image Quality of StyleGAN
This work redesigns the generator normalization, revisits progressive growing, and regularizes the generator to encourage good conditioning in the mapping from latent codes to images, thereby redefining the state of the art in unconditional image modeling.
Distribution Aligned Multimodal and Multi-domain Image Stylization
Qualitative and quantitative comparisons with state-of-the-art methods demonstrate that the proposed unified framework for multimodal and multi-domain style transfer with the support of both exemplar-based reference and randomly sampled guidance can generate high-quality results.
Adversarial Latent Autoencoders
Autoencoder networks are unsupervised approaches that aim to combine generative and representational properties by simultaneously learning an encoder-generator map. Although studied extensively, the…
Diverse Image-to-Image Translation via Disentangled Representations
This work presents an approach based on disentangled representations for producing diverse outputs without paired training images, and proposes to embed images into two spaces: a domain-invariant content space capturing information shared across domains and a domain-specific attribute space.