FML: Face Model Learning From Videos

@article{Tewari2019FMLFM,
  title={FML: Face Model Learning From Videos},
  author={Ayush Tewari and Florian Bernard and Pablo Garrido and Gaurav Bharaj and Mohamed A. Elgharib and Hans-Peter Seidel and Patrick P{\'e}rez and Michael Zollh{\"o}fer and Christian Theobalt},
  journal={2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019},
  pages={10804-10814}
}
Monocular image-based 3D reconstruction of faces is a long-standing problem in computer vision. […] Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model. In order to achieve this, we propose a novel multi-frame consistency loss that ensures consistent shape and appearance across multiple frames of a subject's face, thus minimizing depth ambiguity. At test…
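The multi-frame consistency idea can be made concrete with a short, hedged sketch: assuming a hypothetical regressor that predicts per-frame identity and albedo codes (alongside per-frame expression and pose, not shown) for several frames of the same subject, one simple way to encourage consistent shape and appearance is to penalize the deviation of each per-frame code from the clip average. This is an illustrative PyTorch sketch, not the authors' implementation; all names and dimensions are made up.

import torch

def multi_frame_consistency_loss(identity_codes, albedo_codes):
    # identity_codes, albedo_codes: (num_frames, code_dim) predictions for
    # frames of the same subject. Penalizing deviation from the per-clip mean
    # encourages shape and appearance to agree across frames, while
    # expression and pose stay free per frame (handled elsewhere).
    id_mean = identity_codes.mean(dim=0, keepdim=True)
    alb_mean = albedo_codes.mean(dim=0, keepdim=True)
    return ((identity_codes - id_mean) ** 2).mean() + ((albedo_codes - alb_mean) ** 2).mean()

# Toy usage: random codes for a 4-frame clip.
ident = torch.randn(4, 80)
albedo = torch.randn(4, 64)
print(multi_frame_consistency_loss(ident, albedo))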

Learning 3D Face Reconstruction with a Pose Guidance Network

A self-supervised approach to monocular 3D face reconstruction with a pose guidance network (PGN) that combines the complementary strengths of parametric model learning and data-driven learning techniques.

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection

A novel Learning to Aggregate and Personalize (LAP) framework for unsupervised robust 3D face modeling that recovers superior or competitive face shape and texture compared with state-of-the-art (SOTA) methods with or without prior and supervision.

Deep Facial Non-Rigid Multi-View Stereo

This method optimizes the 3D face shape by explicitly enforcing multi-view appearance consistency, which is known to be effective in recovering shape details according to conventional multi-view stereo methods.

Implicit Neural Deformation for Sparse-View Face Reconstruction

To handle in-the-wild sparse-view input of the same target with different expressions at test time, this work proposes residual latent code to effectively expand the shape space of the learned implicit face representation as well as a novel view-switch loss to enforce consistency among different views.

Learning Complete 3D Morphable Face Models from Images and Videos

This work presents the first approach to learn complete 3D models of face identity and expression geometry, and reflectance, just from images and videos, and shows that the learned models better generalize and lead to higher quality image-based reconstructions than existing approaches.

Implicit Neural Deformation for Multi-View Face Reconstruction

To handle in-the-wild sparse-view input of the same target with different expressions at test time, a novel view-switch loss to enforce consistency among different views is proposed and residual latent code is proposed to effectively expand the shape space of the learned implicit face representation.

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

This work proposes an occlusion-aware view synthesis method to apply multi-view geometry consistency to self-supervised learning, and designs three novel loss functions for multi-view consistency: a pixel consistency loss, a depth consistency loss, and a facial landmark-based epipolar loss.
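Purely as an illustration of the landmark-based epipolar term mentioned in that summary (not that work's actual loss), the following sketch measures point-to-epipolar-line distances for corresponding facial landmarks in two views, assuming the fundamental matrix between the views is known; all names are hypothetical.

import torch

def epipolar_landmark_loss(F, pts_a, pts_b):
    # F: (3, 3) fundamental matrix mapping points in view A to epipolar lines in view B.
    # pts_a, pts_b: (N, 2) corresponding landmark locations in views A and B.
    ones = torch.ones(pts_a.shape[0], 1)
    ha = torch.cat([pts_a, ones], dim=1)            # homogeneous coordinates, view A
    hb = torch.cat([pts_b, ones], dim=1)            # homogeneous coordinates, view B
    lines = ha @ F.T                                # epipolar lines l = F x_a in view B
    num = (hb * lines).sum(dim=1).abs()             # |x_b^T F x_a|
    den = torch.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2).clamp(min=1e-8)
    return (num / den).mean()                       # mean point-to-line distance

# Toy usage: random fundamental matrix and 68 landmark correspondences.
print(epipolar_landmark_loss(torch.randn(3, 3), torch.rand(68, 2), torch.rand(68, 2)))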

Survey on 3D face reconstruction from uncalibrated images

Learning Inverse Rendering of Faces from Real-world Videos

This paper proposes a weakly supervised training approach to train the model on real face videos, based on the assumption of consistency of albedo and normal across different frames, thus bridging the gap between real and synthetic face images.
...

References

SHOWING 1-10 OF 72 REFERENCES

Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz

The first approach that jointly learns a regressor for face shape, expression, reflectance and illumination on the basis of a concurrently learned parametric face model; it compares favorably to the state of the art in reconstruction quality, generalizes better to real-world faces, and runs at over 250 Hz.

3D Face Reconstruction by Learning from Synthetic Data

The proposed approach is based on a Convolutional-Neural-Network (CNN) which extracts the face geometry directly from its image and successfully recovers facial shapes from real images, even for faces with extreme expressions and under various lighting conditions.

High-Fidelity Monocular Face Reconstruction Based on an Unsupervised Model-Based Face Autoencoder

This work proposes a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and presents a stochastic vertex sampling technique for faster training of the networks.

Face reconstruction in the wild

This work addresses the problem of reconstructing 3D face models from large unstructured photo collections, e.g., obtained by Google image search or from personal photo collections in iPhoto, and leverages multi-image shading, but unlike traditional photometric stereo approaches, allows for changes in viewpoint and shape.

InverseFaceNet: Deep Single-Shot Inverse Face Rendering From A Single Image

This work proposes to recover high-quality facial pose, shape, expression, reflectance and illumination using a deep neural network that is trained using a large, synthetically created dataset and builds on a novel loss function that measures model-space similarity directly in parameter space and significantly improves reconstruction accuracy.

Learning Detailed Face Reconstruction from a Single Image

This work proposes to leverage the power of convolutional neural networks to produce a highly detailed face reconstruction from a single image, and introduces an end-to-end CNN framework which derives the shape in a coarse-to-fine fashion.

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which renders training on very large real world data feasible.
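As a minimal sketch of the model-based decoder idea behind such autoencoders (assuming a generic linear 3D morphable model rather than MoFA's specific one), the regressed coefficients can be decoded into a mesh that a differentiable renderer would then compare against the input image; names and dimensions below are illustrative.

import torch

def decode_linear_3dmm(mean_shape, id_basis, exp_basis, alpha, delta):
    # mean_shape: (3V,) average face vertices, flattened as (x, y, z) per vertex.
    # id_basis: (3V, K_id) identity basis; exp_basis: (3V, K_exp) expression basis.
    # alpha, delta: coefficients regressed by the encoder for one input image.
    # A differentiable renderer plus a photometric loss against the input image
    # would close the unsupervised training loop described above.
    verts = mean_shape + id_basis @ alpha + exp_basis @ delta
    return verts.view(-1, 3)

# Toy dimensions: 5,000 vertices, 80 identity and 64 expression coefficients.
V = 5000
verts = decode_linear_3dmm(
    torch.zeros(3 * V), torch.randn(3 * V, 80), torch.randn(3 * V, 64),
    torch.randn(80), torch.randn(64),
)
print(verts.shape)  # torch.Size([5000, 3])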

On Learning 3D Face Morphable Model from In-the-Wild Images

  • Luan Tran, Xiaoming Liu
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
This paper proposes an innovative framework to learn a nonlinear 3DMM from a large set of in-the-wild face images, without collecting 3D face scans, and demonstrates the superior representation power of the nonlinear 3DMM over its linear counterpart, and its contribution to face alignment, 3D reconstruction, and face editing.

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

An Image-to-Image translation network that jointly maps the input image to a depth image and a facial correspondence map can be utilized to provide high quality reconstructions of diverse faces under extreme expressions, using a purely geometric refinement process.

Generating 3D faces using Convolutional Mesh Autoencoders

This work introduces a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface and shows that, replacing the expression space of an existing state-of-the-art face model with this model, achieves a lower reconstruction error.
...