Corpus ID: 239049745

StyleAlign: Analysis and Applications of Aligned StyleGAN Models

@article{Wu2021StyleAlignAA,
  title={StyleAlign: Analysis and Applications of Aligned StyleGAN Models},
  author={Zongze Wu and Yotam Nitzan and Eli Shechtman and D. Lischinski},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.11323}
}
In this paper, we perform an in-depth study of the properties and applications of aligned generative models. We refer to two models as aligned if they share the same architecture, and one of them (the child) is obtained from the other (the parent) via fine-tuning to another domain, a common practice in transfer learning. Several works already utilize some basic properties of aligned StyleGAN models to perform image-to-image translation. Here, we perform the first detailed exploration of model… 
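To make the parent/child notion above concrete, here is a minimal PyTorch sketch of obtaining an aligned child generator by fine-tuning a copy of a pretrained parent on a new domain. This is an illustration under stated assumptions, not the authors' training code: parent_G, D, and child_loader are hypothetical stand-ins, and the loss is the standard non-saturating GAN loss.

```python
import copy
import torch
import torch.nn.functional as F

# Hypothetical stand-ins: any pretrained StyleGAN-like generator/discriminator
# and a dataloader of images from the new (child) domain would play these roles.
def finetune_child(parent_G, D, child_loader, steps=1000, lr=2e-3):
    # The child starts as an exact copy of the parent: same architecture,
    # same weights; this is what makes the two models "aligned".
    child_G = copy.deepcopy(parent_G)
    opt_G = torch.optim.Adam(child_G.parameters(), lr=lr, betas=(0.0, 0.99))
    opt_D = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.0, 0.99))

    data = iter(child_loader)
    for _ in range(steps):
        real = next(data)                      # assumed: a batch of image tensors
        z = torch.randn(real.size(0), 512)     # latent space shared with the parent
        fake = child_G(z)

        # Non-saturating GAN losses in softplus form, as in StyleGAN2 training.
        d_loss = F.softplus(D(fake.detach())).mean() + F.softplus(-D(real)).mean()
        opt_D.zero_grad(); d_loss.backward(); opt_D.step()

        g_loss = F.softplus(-D(child_G(z))).mean()
        opt_G.zero_grad(); g_loss.backward(); opt_G.step()

    return child_G  # aligned with parent_G: same architecture, nearby weights
```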
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
TLDR
Leveraging the semantic power of large-scale Contrastive Language-Image Pre-training (CLIP) models, this work presents a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image.
HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
TLDR
HyperStyle is proposed, a hypernetwork that learns to modulate StyleGAN’s weights to faithfully express a given image in editable regions of the latent space, and yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders.
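As a rough, hypothetical sketch of the hypernetwork idea (not the HyperStyle architecture itself): a small network looks at the target image and the current reconstruction, and predicts per-layer offsets that modulate the generator's weights while leaving its architecture untouched.

```python
import torch
import torch.nn as nn

# Hypothetical hypernetwork: input is the target image and the current
# reconstruction (6 channels concatenated); output is one offset vector per
# generator layer. Layer count and dimensions are illustrative only.
class WeightOffsetHyperNet(nn.Module):
    def __init__(self, n_layers=10, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.heads = nn.ModuleList(nn.Linear(32, feat_dim) for _ in range(n_layers))

    def forward(self, target, recon):
        h = self.backbone(torch.cat([target, recon], dim=1))
        # Generator weights would then be updated per layer as w * (1 + offset),
        # so the same fixed architecture can express the given image faithfully.
        return [head(h) for head in self.heads]
```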
Stitch it in Time: GAN-Based Facial Editing of Real Videos
The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However, replicating their success with videos has…

References

Showing 1-10 of 76 references
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
TLDR
Leveraging the semantic power of large-scale Contrastive Language-Image Pre-training (CLIP) models, this work presents a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image.
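The core mechanism is a directional CLIP loss that steers a trainable copy of the generator away from a frozen copy along a text-defined direction. A minimal sketch follows, assuming the OpenAI clip package; the image batches are assumed to be already resized and normalized for CLIP, and the paper's layer-selection scheme and loss weighting are omitted.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def text_direction(source_text, target_text):
    # Direction in CLIP text space, e.g. "photo" -> "sketch".
    tokens = clip.tokenize([source_text, target_text]).to(device)
    feats = model.encode_text(tokens)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return feats[1] - feats[0]

def directional_loss(frozen_imgs, adapted_imgs, delta_t):
    # frozen_imgs: outputs of the frozen source generator for some latents;
    # adapted_imgs: outputs of the trainable generator for the same latents.
    f_src = model.encode_image(frozen_imgs)
    f_tgt = model.encode_image(adapted_imgs)
    delta_i = f_tgt - f_src
    # Encourage the image-space change to align with the text-space direction.
    return 1 - torch.nn.functional.cosine_similarity(delta_i, delta_t.unsqueeze(0)).mean()
```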
LARGE: Latent-Based Regression through GAN Semantics
TLDR
A novel method for solving regression tasks from few-shot or weak supervision, which turns a pre-trained GAN into a regression model using as few as two labeled samples, and shows that the same latent distances can be used to sort collections of images by the strength of given attributes, even in the absence of explicit supervision.
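A rough sketch of the latent-distance idea, under the assumption that a unit normal of a semantic hyperplane in the GAN's W space is already available (for example, from a latent-space attribute classifier); the signed distance of an inverted latent to that hyperplane serves as the sole regression feature.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical inputs: w_codes are W-space latents of inverted images (N x 512),
# labels are the few available attribute values, normal is the hyperplane normal.
def fit_latent_regressor(w_codes, labels, normal):
    dists = (w_codes @ normal).reshape(-1, 1)   # signed distance per image
    return LinearRegression().fit(dists, labels)

# With only two labeled samples the fit reduces to a line through two points;
# unlabeled images can still be sorted by their raw distances alone.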
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
TLDR
The latent style space of StyleGAN2, a state-of-the-art architecture for image generation, is explored and StyleSpace, the space of channel-wise style parameters, is shown to be significantly more disentangled than the other intermediate latent spaces explored by previous works.
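For intuition, here is a minimal, hypothetical sketch of where channel-wise style parameters come from in a StyleGAN2-like generator: each layer applies its own learned affine map to the intermediate latent w, and concatenating the per-layer outputs gives the StyleSpace representation. Layer sizes below are illustrative.

```python
import torch
import torch.nn as nn

class ToStyle(nn.Module):
    def __init__(self, w_dim=512, channels=(512, 512, 256, 128)):
        super().__init__()
        # One learned affine map per generator layer, as in StyleGAN2.
        self.affines = nn.ModuleList(nn.Linear(w_dim, c) for c in channels)

    def forward(self, w):
        # One style vector per layer, one scalar per output channel.
        return [affine(w) for affine in self.affines]

styles = ToStyle()(torch.randn(1, 512))
# Editing a single entry of one of these vectors changes only one channel's
# modulation, which is why this space tends to be highly disentangled.
```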
Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network
TLDR
Both qualitative and quantitative evaluations were conducted, showing that the proposed I2I translation method achieves outstanding performance in terms of image quality, diversity, and semantic similarity to the input and reference images compared to state-of-the-art works.
Toward Multimodal Image-to-Image Translation
TLDR
This work aims to model a distribution of possible outputs in a conditional generative modeling setting while preventing a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse.
Analyzing and Improving the Image Quality of StyleGAN
TLDR
This work redesigns the generator normalization, revisits progressive growing, and regularizes the generator to encourage good conditioning in the mapping from latent codes to images, thereby redefining the state of the art in unconditional image modeling.
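The normalization redesign centers on weight modulation and demodulation; a short sketch of that computation is given below (shapes are illustrative and not tied to any particular implementation).

```python
import torch

def modulated_conv_weight(weight, style, eps=1e-8):
    # weight: (out_ch, in_ch, k, k) convolution kernel; style: (batch, in_ch).
    w = weight.unsqueeze(0) * style[:, None, :, None, None]        # modulate
    demod = torch.rsqrt((w ** 2).sum(dim=(2, 3, 4), keepdim=True) + eps)
    return w * demod                                               # demodulate
```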
Collaborative Learning for Faster StyleGAN Embedding
TLDR
This work proposes a novel collaborative learning framework that consists of an efficient embedding network and an optimization-based iterator, and shows that a high-quality latent code can be obtained efficiently with a single forward pass through the embedding network.
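A generic sketch of the encoder-plus-iterator inversion scheme that such a framework trains; E, G, and the reconstruction loss are hypothetical placeholders. The collaborative training is what lets the encoder's single forward pass already produce a high-quality code, with the iterator used only as optional refinement.

```python
import torch
import torch.nn.functional as F

def embed(E, G, img, refine_steps=0, lr=0.01):
    # Fast initial guess from the embedding network (one forward pass).
    w = E(img).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(refine_steps):            # optional optimization-based refinement
        loss = F.mse_loss(G(w), img)
        opt.zero_grad(); loss.backward(); opt.step()
    return w
```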
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
TLDR
This work explores leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models to develop a text-based interface for StyleGAN image manipulation that does not require manual effort to discover editing directions.
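A minimal sketch in the spirit of the latent-optimization variant of such text-driven editing: the latent code of an inverted image is optimized to reduce a CLIP distance to a text prompt, with identity- and attribute-preservation terms reduced here to a simple L2 prior. G, w_init, the prompt, and the hyperparameters are hypothetical, and G is assumed to output images already sized and normalized for CLIP.

```python
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
text = model.encode_text(clip.tokenize(["a face with curly hair"]).to(device))

def edit(G, w_init, steps=100, lam=0.02, lr=0.1):
    w0 = w_init.detach()
    w = w0.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = G(w)                                   # assumed CLIP-ready images
        f = model.encode_image(img)
        clip_loss = 1 - torch.nn.functional.cosine_similarity(f, text).mean()
        loss = clip_loss + lam * ((w - w0) ** 2).sum()   # stay near the original code
        opt.zero_grad(); loss.backward(); opt.step()
    return w
```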
DRIT++: Diverse Image-to-Image Translation via Disentangled Representations
TLDR
This work presents an approach based on disentangled representations for generating diverse outputs without paired training images, and shows that it can generate diverse and realistic images on a wide range of tasks.
StarGAN v2: Diverse Image Synthesis for Multiple Domains
TLDR
StarGAN v2, a single framework that addresses both the limited diversity of existing image-to-image translation models and their reliance on multiple models to cover all domains, is proposed and shows significantly improved results over the baselines.