Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis

Tong Sha, Wei Zhang, Tong Shen, Zhoujun Li, and Tao Mei. ACM Computing Surveys.
Deep person generation has attracted extensive research attention due to its wide applications in virtual agents, video conferencing, online shopping and art/movie production. With the advancement of deep learning, visual appearances (face, pose, cloth) of a person image can be easily generated on demand. In this survey, we first summarize the scope of person generation, and then systematically review recent progress and technical trends in identity-preserving deep person generation, covering… 

Human Image Generation: A Comprehensive Survey

This paper divides human image generation techniques into three paradigms, i.e., data-driven methods, knowledge-guided methods, and hybrid methods, and summarizes the advantages and characteristics of each in terms of model architecture and input/output requirements.

FNeVR: Neural Volume Rendering for Face Animation

A Face Neural Volume Rendering (FNeVR) network is proposed to fully explore the potential of 2D motion warping and 3D volume rendering in a unified framework, together with a lightweight pose editor that enables FNeVR to edit the facial pose in a simple yet effective way.

Emotionally Controllable Talking Face Generation from an Arbitrary Emotional Portrait

With the continuous development of cross-modality generation, audio-driven talking face generation has made substantial advances in terms of speech content and mouth shape, but existing research on…

Is More Realistic Better? A Comparison of Game Engine and GAN-based Avatars for Investigative Interviews of Children

The success of investigative interviews with maltreated children is often defined by the interviewer's ability to elicit a reliable and coherent account of the alleged incident from the child.

Synthesizing a Talking Child Avatar to Train Interviewers Working with Maltreated Children

When responding to allegations of child sexual, physical, and psychological abuse, Child Protection Service (CPS) workers and police personnel need to elicit detailed and accurate accounts of the…

Motion Matters: Neural Motion Transfer for Better Camera Physiological Sensing (2023)

A neural video synthesis approach is adapted to augment videos for the task of remote photoplethysmography (rPPG), and the effects of motion augmentation are studied with respect to 1) the magnitude and 2) the type of motion.

ClothFlow: A Flow-Based Model for Clothed Person Generation

ClothFlow, an appearance-flow-based generative model, is presented to synthesize clothed persons for pose-guided person image generation and virtual try-on; strong qualitative and quantitative results validate the effectiveness of the method.
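Appearance-flow models such as ClothFlow predict a dense 2D flow field with a network and then warp source clothing pixels to the target layout by sampling along that flow. As an illustrative sketch of the warping step only (not the paper's actual code; the flow here is hand-specified rather than predicted), a minimal bilinear warp can be written as:

```python
import numpy as np

def warp_by_flow(image, flow):
    """Bilinearly sample `image` (H, W, C) at positions shifted by `flow` (H, W, 2).

    flow[y, x] = (dx, dy) means output[y, x] is sampled from image[y + dy, x + dx].
    Appearance-flow models predict `flow` with a network; here it is given.
    """
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Sampling coordinates, clamped to the image border.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = (sx - x0)[..., None], (sy - y0)[..., None]
    # Standard bilinear interpolation of the four neighbouring pixels.
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy

# A constant flow of (dx=1, dy=0) shifts content one pixel to the left.
img = np.arange(16, dtype=np.float64).reshape(4, 4, 1)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
shifted = warp_by_flow(img, flow)
```

In practice this sampling is done with a differentiable operator (e.g. `grid_sample` in deep learning frameworks) so the flow-predicting network can be trained end to end.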

Down to the Last Detail: Virtual Try-on with Detail Carving

A novel multi-stage framework is proposed to synthesize person images in which rich details in salient regions are well preserved, along with a Tree-Block (tree dilated fusion block) that harnesses multi-scale features in the generator networks.

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

A deep neural network model is proposed that takes an audio signal A of a source person and a very short video V of a target person as input, and outputs a synthesized high-quality talking face video with personalized head pose, expression, and lip synchronization, making use of the visual information in V.

Towards Multi-Pose Guided Virtual Try-On Network

This paper makes the first attempt towards a multi-pose guided virtual try-on system, which enables clothes to be transferred onto a person in diverse poses and significantly outperforms all state-of-the-art methods both qualitatively and quantitatively.

Audio-driven Talking Face Video Generation with Natural Head Pose

A deep neural network model is proposed that takes an audio signal A of a source person and a very short video V of a target person as input, and outputs a synthesized high-quality talking face video with natural head pose, expression, and lip synchronization, outperforming the state-of-the-art methods.

Human Appearance Transfer

The proposed architecture can automatically generate images of a person dressed in clothing transferred from a person in another image, opening paths for applications in entertainment and photo editing, as well as affordable online shopping for clothing.

FashionOn: Semantic-guided Image-based Virtual Try-on with Detailed Human and Clothing Information

A novel FashionOn network is proposed to synthesize images of users wearing different clothes in arbitrary poses, providing comprehensive information about how well the clothes fit; it achieves state-of-the-art virtual try-on performance both qualitatively and quantitatively.

Multistage Adversarial Losses for Pose-Based Human Image Synthesis

This paper proposes a pose-based human image synthesis method that keeps the human posture unchanged in novel viewpoints and adopts multistage adversarial losses separately for foreground and background generation, fully exploiting the multi-modal characteristics of the generative loss to produce more realistic-looking images.

Pose-Guided Person Image Synthesis in the Non-Iconic Views

A novel Region of Interest (RoI) perceptual loss is proposed to optimize the MR-Net, and extensive experiments show the efficacy of the proposed model in tackling pose-guided person image generation from non-iconic views.

Progressive Pose Attention Transfer for Person Image Generation

A new generative adversarial network is proposed for the problem of pose transfer, i.e., transferring the pose of a given person to a target pose; the generated images can also serve as training data for person re-identification, alleviating data insufficiency.
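Pose-guided generators of this kind typically condition on the target pose encoded as one Gaussian heatmap per body keypoint rather than on raw coordinates. A minimal sketch of that encoding step, with illustrative names and a hand-picked `sigma` (not taken from any specific paper), could look like:

```python
import numpy as np

def keypoint_heatmaps(keypoints, height, width, sigma=2.0):
    """Render one Gaussian heatmap per (x, y) keypoint.

    `keypoints` is a list of (x, y) pixel coordinates, e.g. detected body
    joints; each output channel peaks at 1.0 at its keypoint location.
    The stacked maps are what a pose-transfer generator would consume.
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    maps = np.empty((len(keypoints), height, width))
    for i, (kx, ky) in enumerate(keypoints):
        maps[i] = np.exp(-((xs - kx) ** 2 + (ys - ky) ** 2) / (2 * sigma ** 2))
    return maps

# Two hypothetical keypoints on a 16x16 grid.
hm = keypoint_heatmaps([(8, 4), (2, 12)], height=16, width=16)
```

This spatial encoding lets convolutional layers align pose information with image features channel by channel, which is what attention-based pose-transfer blocks exploit.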