Controllable Person Image Synthesis With Attribute-Decomposed GAN

  • Yifang Men, Yiming Mao, Yuning Jiang, Wei-Ying Ma, Zhouhui Lian
  • Published 27 March 2020
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
This paper introduces the Attribute-Decomposed GAN, a novel generative model for controllable person image synthesis, which can produce realistic person images with desired human attributes (e.g., pose, head, upper clothes and pants) provided in various source inputs. The core idea of the proposed model is to embed human attributes into the latent space as independent codes and thus achieve flexible and continuous control of attributes via mixing and interpolation operations in explicit style… 
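The mixing and interpolation of independent attribute codes can be sketched as follows; this is a minimal illustration assuming hypothetical per-attribute style codes stored as NumPy vectors (the names `mix_attribute_codes` and `interpolate_codes` are illustrative, not the paper's implementation):

```python
import numpy as np

def mix_attribute_codes(codes_a, codes_b, swap):
    """Swap selected per-attribute style codes (e.g. 'upper clothes')
    from source B into source A, leaving the other attributes intact."""
    mixed = dict(codes_a)
    for attr in swap:
        mixed[attr] = codes_b[attr]
    return mixed

def interpolate_codes(codes_a, codes_b, t):
    """Linearly interpolate every attribute code between two sources;
    t=0 reproduces A, t=1 reproduces B, giving continuous control."""
    return {k: (1 - t) * codes_a[k] + t * codes_b[k] for k in codes_a}
```

Because each attribute occupies its own code, changing one attribute (say, pants) leaves the codes for pose or head untouched, which is what makes the control flexible.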

Pose and Attribute Consistent Person Image Synthesis

Extensive experimental results on the DeepFashion dataset demonstrate the superiority of the PAC-GAN method over state-of-the-art methods, especially for maintaining pose and attribute consistency under large pose variations.

Cross Attention Based Style Distribution for Controllable Person Image Synthesis

A cross attention based style distribution module computes attention between the source semantic styles and the target pose for pose transfer; the effectiveness of the model is validated quantitatively and qualitatively on pose transfer and virtual try-on tasks.
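The general mechanism behind such a module is standard scaled dot-product cross attention; the sketch below is a generic illustration (the names `pose_queries`, `style_keys`, and `style_values` are assumptions, not the paper's exact module):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(pose_queries, style_keys, style_values):
    """Queries derived from the target pose attend over keys/values
    derived from the source semantic styles, so each target location
    gathers the most relevant source style features."""
    d = pose_queries.shape[-1]
    scores = pose_queries @ style_keys.T / np.sqrt(d)  # (Nq, Nk)
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    return weights @ style_values                      # (Nq, Dv)
```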

FitGAN: Fit- and Shape-Realistic Generative Adversarial Networks for Fashion

FitGAN is presented, a generative adversarial model that explicitly accounts for the entangled size and shape characteristics of garments in online fashion at scale; conditioned on disentangled item representations, it generates realistic images that reflect the fit and shape properties of fashion articles.

Controllable Person Image Synthesis GAN and Its Reconfigurable Energy-efficient Hardware Implementation

  • Shaoyue Lin, Yanjun Zhang
  • Computer Science
    2022 the 6th International Conference on Innovation in Artificial Intelligence (ICIAI)
  • 2022
This paper proposes a GAN for person image synthesis that generates high-quality person images with controllable pose and attributes, and designs a synthesizable library for the GAN to enable faster hardware reconfiguration.

Smart Fashion: A Review of AI Applications in the Fashion & Apparel Industry

A structured, task-based multi-label classification of fashion research articles provides researchers with explicit research directions, facilitates access to related studies, and improves the visibility of those studies.

Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

The StyleGAN generator is extended so that it takes pose as input and introduces a spatially varying modulation of the latent space using warped local features (for controlling appearance); the method compares favorably against state-of-the-art algorithms.

Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization

A novel Spatially-Adaptive Warped Normalization (SAWN) is introduced, which integrates a learned flow field to warp modulation parameters; a novel self-training part-replacement strategy refines the pretrained model for the texture-transfer task, significantly improving the quality of the generated clothing and the preservation of irrelevant regions.

Do as we do: Multiple Person Video-To-Video Transfer

This work proposes a marker-less approach for multiple-person video-to-video transfer that uses pose as an intermediate representation for body motions and temporal consistency; it convincingly transfers body motion to the target video while preserving specific features of the target video.

Style and Pose Control for Image Synthesis of Humans from a Single Monocular View

Fig. 1: StylePoseGAN, a new approach for synthesising photo-realistic novel views of a human from a single input image, with explicit control over pose and per-body-part appearance.

Progressive Pose Attention Transfer for Person Image Generation

A new generative adversarial network is proposed for the problem of pose transfer, i.e., transferring the pose of a given person to a target pose; the generated images can also serve as training data for person re-identification, alleviating data insufficiency.

U-Net: Convolutional Networks for Biomedical Image Segmentation

It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
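The defining feature of the U-Net architecture is the skip connection that concatenates encoder features with upsampled decoder features; a minimal sketch of one encoder/decoder level (illustrative only, using max-pool downsampling and nearest-neighbour upsampling rather than the paper's learned convolutions):

```python
import numpy as np

def down(x):
    # 2x2 max-pool downsampling (assumes H and W are even)
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def up(x):
    # nearest-neighbour 2x upsampling back to the encoder resolution
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_skip(x):
    """One U-Net level: the full-resolution encoder feature map is kept
    and concatenated channel-wise with the upsampled decoder features,
    so fine spatial detail survives the bottleneck."""
    skip = x                  # encoder features saved for the skip path
    bottleneck = down(x)      # coarser, more abstract representation
    decoded = up(bottleneck)  # restore spatial resolution
    return np.concatenate([skip, decoded], axis=-1)
```

The concatenation doubles the channel count at each decoder level, which is why the segmentation output can be both semantically informed and spatially precise.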

A Variational U-Net for Conditional Appearance and Shape Generation

A conditional U-Net is presented for shape-guided image generation, conditioned on the output of a variational autoencoder for appearance, trained end-to-end on images, without requiring samples of the same object with varying pose or appearance.

Disentangled Person Image Generation

A novel, two-stage reconstruction pipeline is proposed that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time and can manipulate the foreground, background and pose of the input image, and also sample new embedding features to generate targeted manipulations, that provide more control over the generation process.

Deformable GANs for Pose-Based Human Image Generation

This paper introduces deformable skip connections in the generator of the Generative Adversarial Network and proposes a nearest-neighbour loss instead of the common L1 and L2 losses in order to match the details of the generated image with the target image.
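The intuition behind a nearest-neighbour loss can be sketched as follows: instead of penalising every pixel against the target pixel at the same location (as L1/L2 do), each generated pixel is compared to the best-matching target pixel in a small neighbourhood, tolerating slight spatial misalignments. This is an illustrative per-pixel variant, not the paper's exact patch-based formulation:

```python
import numpy as np

def nn_loss(gen, tgt, radius=1):
    """For each generated pixel, take the minimum L1 distance to any
    target pixel within a (2*radius+1)^2 window, then average.
    gen, tgt: (H, W, C) float arrays."""
    h, w = gen.shape[:2]
    pad = np.pad(tgt, ((radius, radius), (radius, radius), (0, 0)), mode='edge')
    best = np.full((h, w), np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = pad[dy:dy + h, dx:dx + w]
            d = np.abs(gen - shifted).sum(axis=-1)  # per-pixel L1
            best = np.minimum(best, d)              # keep closest match
    return best.mean()
```

Because the zero-offset comparison is one of the candidates, this loss is never larger than plain L1, and it stops small warping errors from dominating the objective.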

Pose Guided Person Image Generation

The novel Pose Guided Person Generation Network (PG²) is proposed, which synthesizes person images in arbitrary poses based on an image of that person and a novel pose.

Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization

This paper presents a simple yet effective approach that for the first time enables arbitrary style transfer in real-time, comparable to the fastest existing approach, without the restriction to a pre-defined set of styles.
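The core operation, Adaptive Instance Normalization (AdaIN), aligns the per-channel mean and standard deviation of the content features to those of the style features. A minimal NumPy sketch (feature maps as (H, W, C) arrays; a real implementation operates on batched network activations):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """AdaIN: normalize content features per channel, then re-scale and
    re-shift them with the style features' channel statistics."""
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True)
    s_mean = style.mean(axis=(0, 1), keepdims=True)
    s_std = style.std(axis=(0, 1), keepdims=True)
    normalized = (content - c_mean) / (c_std + eps)
    return normalized * s_std + s_mean
```

Because the transfer is just a statistics swap with no per-style parameters, any style image can be applied at test time, which is what removes the restriction to a pre-defined set of styles.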

Learning character-agnostic motion for motion retargeting in 2D

This paper presents a new method for retargeting video-captured motion between different human performers without explicitly reconstructing 3D poses and/or camera parameters, and demonstrates that the framework can robustly extract human motion from videos, bypassing 3D reconstruction and outperforming existing retargeting methods when applied to videos in the wild.

Unsupervised Person Image Generation With Semantic Parsing Transformation

This paper proposes a new pathway to decompose the hard mapping into two more accessible subtasks, namely, semantic parsing transformation and appearance generation, and proposes a semantic generative network to transform between semantic parsing maps, in order to simplify the non-rigid deformation learning.