LatentKeypointGAN: Controlling Images via Latent Keypoints - Extended Abstract

Authors: Xingzhe He, Bastian Wandt, Helge Rhodin
Abstract: Generative adversarial networks (GANs) can now generate photo-realistic images. However, how to best control the image content remains an open challenge. We introduce LatentKeypointGAN, a two-stage GAN internally conditioned on a set of keypoints and associated appearance embeddings, providing control of the position and style of the generated objects and their respective parts. A major difficulty that we address is disentangling the image into spatial and appearance factors with little…

Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
StyleMapGAN is proposed: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN, which makes embedding through an encoder more accurate than existing optimization-based methods while maintaining the properties of GANs.
Disentangled Image Generation Through Structured Noise Injection
It is shown that disentanglement in the first layer of the generator network leads to disentanglement in the generated image, and that, through a grid-based structure, several aspects of disentanglement are achieved without complicating the network architecture and without requiring labels.
A Style-Based Generator Architecture for Generative Adversarial Networks
An alternative generator architecture for generative adversarial networks is proposed, borrowing from style transfer literature, that improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation.
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
A new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented, which significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
Analyzing and Improving the Image Quality of StyleGAN
This work redesigns the generator normalization, revisits progressive growing, and regularizes the generator to encourage good conditioning in the mapping from latent codes to images, and thereby redefines the state of the art in unconditional image modeling.
Unsupervised Learning of Object Landmarks through Conditional Image Generation
This work proposes a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision and introduces a tight bottleneck in the geometry-extraction process that selects and distils geometry-related features.
Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation
This work presents novel hierarchical adaptive Diagonal spatial ATtention (DAT) layers that manipulate spatial contents separately from styles in a hierarchical manner, and confirms that the proposed method not only outperforms existing models in disentanglement scores but also provides more flexible control over spatial features in the generated images.
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
We propose an efficient algorithm to embed a given image into the latent space of StyleGAN. This embedding enables semantic image editing operations that can be applied to existing photographs.
Editing in Style: Uncovering the Local Semantics of GANs
A simple and effective method is presented for making local, semantically-aware edits to a target output image via a novel manipulation of style vectors; it relies on the emergent disentanglement of semantic objects learned by StyleGAN during its training.
SEAN: Image Synthesis With Semantic Region-Adaptive Normalization
We propose semantic region-adaptive normalization (SEAN), a simple but effective building block for Generative Adversarial Networks conditioned on segmentation masks that describe the semantic