Corpus ID: 238582746

Self-Supervised 3D Face Reconstruction via Conditional Estimation

@article{Wen2021SelfSupervised3F,
  title={Self-Supervised 3D Face Reconstruction via Conditional Estimation},
  author={Yandong Wen and Weiyang Liu and Bhiksha Raj and Rita Singh},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.04800}
}
  • Yandong Wen, Weiyang Liu, Bhiksha Raj, Rita Singh
  • Published 10 October 2021
  • Computer Science, Engineering
  • ArXiv
We present a conditional estimation (CEST) framework to learn 3D facial parameters from 2D single-view images by self-supervised training from videos. CEST is based on the process of analysis by synthesis, where the 3D facial parameters (shape, reflectance, viewpoint, and illumination) are estimated from the face image, and then recombined to reconstruct the 2D face image. In order to learn semantically meaningful 3D facial parameters without explicit access to their labels, CEST couples the…
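The analysis-by-synthesis idea in the abstract (estimate parameters, recombine them into an image, compare with the input) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the CEST method: the per-pixel Lambertian shading model, the helper names `shade`, `render`, and `reconstruction_loss`, and the omission of viewpoint are all hypothetical simplifications.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def shade(albedo: float, normal: Vec3, light: Vec3) -> float:
    """Lambertian intensity for one pixel: albedo * max(0, n . l).

    Assumption: a Lambertian model stands in for the real image formation.
    """
    n_dot_l = sum(n * l for n, l in zip(normal, light))
    return albedo * max(0.0, n_dot_l)


def render(albedos: Sequence[float], normals: Sequence[Vec3], light: Vec3) -> List[float]:
    """Recombine reflectance (albedo), shape (normals), and illumination into pixels."""
    return [shade(a, n, light) for a, n in zip(albedos, normals)]


def reconstruction_loss(rendered: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between the re-rendered image and the input image."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(target)


# Toy two-pixel "image": one pixel faces the light, the other is perpendicular to it.
normals = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
albedos = [0.8, 0.5]
light = (0.0, 0.0, 1.0)
target = render(albedos, normals, light)

# Correct parameter estimates reproduce the input; a wrong albedo does not.
loss_good = reconstruction_loss(render(albedos, normals, light), target)
loss_bad = reconstruction_loss(render([0.4, 0.5], normals, light), target)
```

In a self-supervised setting, a loss of this kind would be minimized over the estimated parameters instead of comparing against parameter labels; how CEST conditions the estimates on one another is not shown here.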
Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
This work studies learning from a synergy process of 3D Morphable Models (3DMM) and 3D facial landmarks to predict complete 3D facial geometry, including 3D alignment, face orientation, and 3D face…

References

SHOWING 1-10 OF 49 REFERENCES
Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency
TLDR
This work proposes an occlusion-aware view synthesis method to apply multi-view geometry consistency to self-supervised learning, and designs three novel loss functions for multi-view consistency: the pixel consistency loss, the depth consistency loss, and the facial landmark-based epipolar loss.
Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision
TLDR
To train a network without any 2D-to-3D supervision, RingNet is presented, which learns to compute 3D face shape from a single image and achieves invariance to expression by representing the face using the FLAME model.
Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz
TLDR
The first approach that jointly learns a regressor for face shape, expression, reflectance and illumination on the basis of a concurrently learned parametric face model is presented; it compares favorably to the state of the art in reconstruction quality, generalizes better to real-world faces, and runs at over 250 Hz.
FML: Face Model Learning From Videos
TLDR
This work proposes multi-frame video-based self-supervised training of a deep network that learns a face identity model both in shape and appearance while jointly learning to reconstruct 3D faces.
Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression
TLDR
This work proposes to address many of these limitations by training a Convolutional Neural Network (CNN) on an appropriate dataset consisting of 2D images and 3D facial models or scans, and achieves this via a simple CNN architecture that performs direct regression of a volumetric representation of the 3D facial geometry from a single 2D image.
On Learning 3D Face Morphable Model from In-the-Wild Images
  • Luan Tran, Xiaoming Liu
  • Computer Science, Medicine
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
TLDR
This paper proposes an innovative framework to learn a nonlinear 3DMM model from a large set of in-the-wild face images, without collecting 3D face scans, and demonstrates the superior representation power of the nonlinear 3DMM over its linear counterpart, and its contribution to face alignment, 3D reconstruction, and face editing.
Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network
TLDR
A straightforward method simultaneously reconstructs the 3D facial structure and provides dense alignment, surpassing other state-of-the-art methods on both reconstruction and alignment tasks by a large margin.
Corrective 3D reconstruction of lips from monocular video
TLDR
This work quantitatively and qualitatively shows that the monocular approach reconstructs higher quality lip shapes, even for complex shapes like a kiss or lip rolling, than previous monocular approaches, and generalizes to new individuals and general scenes, enabling high-fidelity reconstruction even from commodity video footage.
Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation
TLDR
An Image-to-Image translation network that jointly maps the input image to a depth image and a facial correspondence map can be utilized to provide high quality reconstructions of diverse faces under extreme expressions, using a purely geometric refinement process.
Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network
TLDR
A robust method for regressing discriminative 3D morphable face models (3DMM) using a convolutional neural network to regress 3DMM shape and texture parameters directly from an input photo is described.