Corpus ID: 237154246

A Latent Transformer for Disentangled Face Editing in Images and Videos

@inproceedings{Yao2021ALT,
  title={A Latent Transformer for Disentangled Face Editing in Images and Videos},
  author={Xu Yao and Alasdair Newson and Yann Gousseau and Pierre Hellier},
  year={2021}
}
High quality facial image editing is a challenging problem in the movie post-production industry, requiring a high degree of control and identity preservation. Previous works that attempt to tackle this problem may suffer from the entanglement of facial attributes and the loss of the person's identity. Furthermore, many algorithms are limited to a certain task. To tackle these limitations, we propose to edit facial attributes via the latent space of a StyleGAN generator, by training a dedicated…
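The core idea the abstract describes — editing a facial attribute by moving a code in the generator's latent space — can be sketched as a linear shift along a learned direction. The sketch below is illustrative only (the variable names and the random stand-ins for a real latent code and attribute direction are assumptions, not the paper's actual API); it shows the arithmetic of a disentangled latent edit, not the paper's trained latent transformer.

```python
import numpy as np

# Hypothetical sketch: linear attribute editing in a StyleGAN-style latent
# space. In practice `w` would come from an encoder / GAN inversion and
# `direction` from a trained model; here both are random stand-ins.
rng = np.random.default_rng(0)
latent_dim = 512                        # dimensionality of StyleGAN's W space

w = rng.standard_normal(latent_dim)     # latent code of the input face
direction = rng.standard_normal(latent_dim)
direction /= np.linalg.norm(direction)  # unit attribute direction (e.g. "smile")

alpha = 3.0                             # editing strength
w_edited = w + alpha * direction        # the edit: shift w along the direction

# The edit moves w only along the attribute direction, so the change
# decomposes exactly as alpha * direction.
assert np.allclose(w_edited - w, alpha * direction)
```

Feeding `w_edited` back through the (pretrained) generator would then produce the edited face; identity preservation hinges on the direction being disentangled from the other attributes.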
Citations

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model
Zipeng Xu, Tianwei Lin, +6 authors, Errui Ding · Computer Science · 2021
This paper proposes a novel framework, Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation, which does not need manual annotation and is thus not limited to fixed manipulations.

References

Showing 1–10 of 45 references
Interpreting the Latent Space of GANs for Semantic Face Editing
This work proposes InterFaceGAN, a framework for semantic face editing that interprets the latent semantics learned by GANs, and finds that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations.
AttGAN: Facial Attribute Editing by Only Changing What You Want
The proposed method is extended to attribute style manipulation in an unsupervised manner and outperforms the state of the art on realistic attribute editing while keeping other facial details well preserved.
LEED: Label-Free Expression Editing via Disentanglement
An innovative label-free expression editing via disentanglement (LEED) framework that can edit the expression of both frontal and profile facial images without requiring any expression labels.
A Style-Based Generator Architecture for Generative Adversarial Networks
An alternative generator architecture for generative adversarial networks, borrowing from the style-transfer literature, that improves the state of the art on traditional distribution-quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation.
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
A new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs), which significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing
Arbitrary attribute editing can generally be tackled by combining an encoder–decoder with generative adversarial networks. However, the bottleneck layer in the encoder–decoder usually gives rise to…
Fader Networks: Manipulating Images by Sliding Attributes
A new encoder–decoder architecture trained to reconstruct images by disentangling the salient information of the image and the attribute values directly in the latent space, which results in much simpler training schemes and scales nicely to multiple attributes.
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
The latent style space of StyleGAN2, a state-of-the-art architecture for image generation, is explored, and StyleSpace, the space of channel-wise style parameters, is shown to be significantly more disentangled than the other intermediate latent spaces explored by previous works.
Analyzing and Improving the Image Quality of StyleGAN
This work redesigns the generator normalization, revisits progressive growing, and regularizes the generator to encourage good conditioning in the mapping from latent codes to images, thereby redefining the state of the art in unconditional image modeling.
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input.