Deep video portraits

Hyeongwoo Kim, Pablo Garrido, Ayush Tewari, Weipeng Xu, Justus Thies, Matthias Nießner, Patrick Pérez, Christian Richardt, Michael Zollhöfer, Christian Theobalt. ACM Transactions on Graphics (TOG), pages 1-14.
We present a novel approach that enables photo-realistic re-animation of portrait videos using only an input video. In contrast to existing approaches that are restricted to manipulations of facial expressions only, we are the first to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor. The core of our approach is a generative neural network with a novel space-time architecture. The network… 

Photorealistic Audio-driven Video Portraits

A novel method to synthesize photorealistic video portraits for an input portrait video, automatically driven by a person's voice, using a parametric 3D face model represented by geometry, facial expression, illumination, etc., and a mapping from audio features to model parameters.
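As a rough illustration only (not the paper's actual system), the audio-to-parameter mapping described above can be sketched as a small learned regressor from per-frame audio features to expression coefficients of a parametric face model. All dimensions and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a 40-D audio feature vector (e.g. stacked MFCCs)
# mapped to 64-D expression coefficients of a parametric 3D face model.
AUDIO_DIM, EXPR_DIM, HIDDEN = 40, 64, 128

# A one-hidden-layer MLP standing in for the learned audio-to-parameter map
# (weights would normally be trained on paired audio/face-model data).
W1 = rng.normal(0.0, 0.1, (HIDDEN, AUDIO_DIM))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (EXPR_DIM, HIDDEN))
b2 = np.zeros(EXPR_DIM)

def audio_to_expression(audio_feat: np.ndarray) -> np.ndarray:
    """Map one frame of audio features to face-model expression parameters."""
    h = np.tanh(W1 @ audio_feat + b1)  # hidden nonlinearity
    return W2 @ h + b2                 # predicted expression coefficients

frame = rng.normal(size=AUDIO_DIM)
expr = audio_to_expression(frame)
```

The predicted coefficients would then drive the parametric face model frame by frame, with the renderer supplying geometry and illumination.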

Neural Relighting and Expression Transfer On Video Portraits

A neural relighting and expression transfer technique to transfer the head pose and facial expressions from a source performer to a portrait video of a target performer while enabling dynamic relighting.

Dynamic Neural Portraits

Unlike existing methods that rely on GAN-based image-to-image translation networks to transform renderings of 3D faces into photo-realistic images, the proposed architecture builds upon a 2D coordinate-based MLP with controllable dynamics.
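A minimal sketch of the coordinate-based idea, under illustrative assumptions (Fourier-encoded 2D pixel coordinates plus a control vector fed through a tiny MLP; the actual architecture, sizes, and conditioning are not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fourier positional encoding of 2D pixel coordinates, as commonly used by
# coordinate-based MLPs (frequency count here is illustrative).
FREQS = 2.0 ** np.arange(4)  # 4 frequency bands

def encode(xy: np.ndarray) -> np.ndarray:
    """xy: (N, 2) coordinates in [0, 1] -> (N, 16) sin/cos features."""
    angles = xy[:, :, None] * FREQS * np.pi               # (N, 2, 4)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(len(xy), -1)

IN_DIM = 2 * 2 * len(FREQS)   # 16 encoded coordinate features
CTRL_DIM, HIDDEN = 8, 64      # hypothetical control vector (pose/expression)
W1 = rng.normal(0.0, 0.1, (HIDDEN, IN_DIM + CTRL_DIM))
W2 = rng.normal(0.0, 0.1, (3, HIDDEN))

def render_pixels(xy: np.ndarray, ctrl: np.ndarray) -> np.ndarray:
    """Predict an RGB value at each coordinate, conditioned on a control vector."""
    x = np.concatenate([encode(xy), np.tile(ctrl, (len(xy), 1))], axis=1)
    h = np.maximum(W1 @ x.T, 0.0)  # ReLU hidden layer
    return (W2 @ h).T              # (N, 3) RGB values

xy = rng.uniform(size=(5, 2))
rgb = render_pixels(xy, rng.normal(size=CTRL_DIM))
```

Because the network maps coordinates directly to colors, changing the control vector re-renders the portrait without any image-to-image translation stage.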

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation

This work presents a live system that generates personalized photorealistic talking-head animation only driven by audio signals at over 30 fps and synthesizes high-fidelity personalized facial details, e.g., wrinkles, teeth.

Neural Rendering and Reenactment of Human Actor Videos

The proposed method for generating video-realistic animations of real humans under user control relies on a video sequence in conjunction with a (medium-quality) controllable 3D template model of the person to generate a synthetically rendered version of the video.

Warp-guided GANs for single-photo facial animation

This paper introduces a novel method for realtime portrait animation in a single photo that factorizes out the nonlinear geometric transformations exhibited in facial expressions by lightweight 2D warps and leaves the appearance detail synthesis to conditional generative neural networks for high-fidelity facial animation generation.

Textured Neural Avatars

A system for learning full-body neural avatars is presented: deep networks that produce full-body renderings of a person for varying body pose and varying camera pose, capable of generating realistic renderings while being trained on videos annotated with 3D poses and foreground masks.

TACR-Net: Editing on Deep Video and Voice Portraits

A novel deep learning framework, named Temporal-Refinement Autoregressive-Cascade Rendering Network (TACR-Net), for audio-driven dynamic talking face editing, which encodes facial expression blendshapes from the given acoustic features without separate training for a specific video.

RigNeRF: Fully Controllable Neural 3D Portraits

ShahRukh Athar. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
This work proposes RigNeRF, a system that goes beyond novel view synthesis and enables full control of head pose and facial expressions learned from a single portrait video, and demonstrates the effectiveness of the method on free-view synthesis of a portrait scene with explicit head pose and expression controls.

Geometry Driven Progressive Warping for One-Shot Face Animation

This work presents a geometry-driven model that uses two geometric patterns as guidance, 3D-face-rendered displacement maps and posed neural codes, and proposes a progressive warping module that alternates between feature warping and displacement estimation at increasing resolutions.

Bringing portraits to life

A technique to automatically animate a still portrait, making it possible for the subject in the photo to come to life and express various emotions, and gives rise to reactive profiles, where people in still images can automatically interact with their viewers.

Face2Face: Real-Time Face Capture and Reenactment of RGB Videos

A novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video) that addresses the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling, and re-renders the manipulated output video in a photo-realistic fashion.

Demo of Face2Face: real-time face capture and reenactment of RGB videos

A novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video) that addresses the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling, and re-renders the manipulated output video in a photo-realistic fashion.

Dynamic 3D avatar creation from hand-held video input

This system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading.
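The blendshape-adaptation step above amounts to an optimization over expression weights. A toy version of just the weight-fitting part, as a linear least-squares problem over a hypothetical additive blendshape model (the paper's full objective also includes feature tracking, optical flow, and shading terms):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy blendshape model: a neutral face plus K additive expression offsets.
N_VERTS, K = 30, 5
neutral = rng.normal(size=(N_VERTS, 3))
blendshapes = rng.normal(size=(K, N_VERTS, 3))  # per-expression vertex offsets

def fit_weights(observed: np.ndarray) -> np.ndarray:
    """Least-squares blendshape weights w minimizing
    || neutral + sum_k w_k * B_k - observed ||^2."""
    A = blendshapes.reshape(K, -1).T        # (3 * N_VERTS, K) design matrix
    b = (observed - neutral).reshape(-1)    # observed deviation from neutral
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

# Synthesize an observation from known weights and recover them.
w_true = rng.uniform(0.0, 1.0, K)
observed = neutral + np.tensordot(w_true, blendshapes, axes=1)
w_est = fit_weights(observed)
```

With noise-free synthetic vertices the recovered weights match the generating ones exactly; real systems solve a regularized version of this per frame.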

Realistic Dynamic Facial Textures from a Single Image Using GANs

A Deep Generative Network is trained that can infer realistic per-frame texture deformations of the target identity using the per-frame source textures and the single target texture, and can both animate the face and perform video face replacement on the source video using the target appearance.

Demo of FaceVR: real-time facial reenactment and eye gaze control in virtual reality

We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture

Real-time expression transfer for facial reenactment

The novelty of the approach lies in the transfer and photorealistic re-rendering of facial deformations and detail into the target video in a way that the newly-synthesized expressions are virtually indistinguishable from a real video.

High-fidelity facial and speech animation for VR HMDs

This work introduces a novel system for HMD users to control a digital avatar in real-time while producing plausible speech animation and emotional expressions and demonstrates the quality of the system on a variety of subjects and evaluates its performance against state-of-the-art real-time facial tracking techniques.

Reconstruction of Personalized 3D Face Rigs from Monocular Video

A novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data, based on three distinct layers that model the actor's facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles.

Perspective-aware manipulation of portrait photos

This paper introduces a method to modify the apparent relative pose and distance between camera and subject given a single portrait photo, and builds a 2D warp in the image plane to approximate the effect of a desired change in 3D.
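The perspective effect this 2D warp approximates can be seen directly in pinhole projection: moving the camera closer exaggerates depth differences across the face. A small numeric illustration (all geometry and values invented for the example, not taken from the paper):

```python
import numpy as np

def project(points: np.ndarray, cam_dist: float, focal: float) -> np.ndarray:
    """Pinhole projection of (N, 3) head-space points (z = depth relief).
    Returns (N, 2) image-plane coordinates."""
    z = cam_dist + points[:, 2]              # depth of each point from camera
    return focal * points[:, :2] / z[:, None]

# Two landmarks: an ear at the head's depth plane, and a nose tip
# that sticks out 3 units toward the camera.
face = np.array([[5.0, 0.0, 0.0],    # ear
                 [1.0, 0.0, -3.0]])  # nose tip (closer to camera)

# Near vs. far camera, with focal length scaled to keep the same framing.
near = project(face, cam_dist=30.0, focal=30.0)
far = project(face, cam_dist=120.0, focal=120.0)
```

The per-landmark displacement between the two projections (here the nose shifts while the ear stays put) is exactly the kind of 2D warp field the method builds in the image plane to simulate a change in camera distance.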