Corpus ID: 4335253

Unsupervised Depth Estimation, 3D Face Rotation and Replacement

@article{Moniz2018UnsupervisedDE,
  title={Unsupervised Depth Estimation, 3D Face Rotation and Replacement},
  author={Joel Ruben Antony Moniz and Christopher Beckham and Simon Rajotte and Sina Honari and Christopher Joseph Pal},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.09202}
}
We present an unsupervised approach for learning to estimate three dimensional (3D) facial structure from a single image while also predicting 3D viewpoint transformations that match a desired pose and facial geometry. We achieve this by inferring the depth of facial keypoints of an input image in an unsupervised manner, without using any form of ground-truth depth information. We show how it is possible to use these depths as intermediate computations within a new backpropable loss to predict… 
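To make the abstract's mechanism concrete, the following is a minimal PyTorch sketch, not the authors' code: the small MLP (DepthPredictor), the 68-keypoint count, the orthographic projection and the placeholder identity rotation are all illustrative assumptions. It shows how predicting a depth per 2D keypoint, lifting the keypoints to 3D, transforming them toward a target pose and penalizing the 2D reprojection error lets gradients reach the depth predictor without any ground-truth depth.

import torch
import torch.nn as nn

N_KPTS = 68  # assumed number of facial keypoints

class DepthPredictor(nn.Module):
    """Maps flattened 2D keypoints (x, y) to one depth value per keypoint."""
    def __init__(self, n_kpts: int = N_KPTS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_kpts, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_kpts),
        )

    def forward(self, kpts_2d: torch.Tensor) -> torch.Tensor:
        # kpts_2d: (B, N, 2) -> depth: (B, N, 1)
        b, n, _ = kpts_2d.shape
        return self.net(kpts_2d.reshape(b, -1)).reshape(b, n, 1)

def reprojection_loss(src_2d, tgt_2d, depth, rotation):
    """L2 error between rotated-and-projected source keypoints and target keypoints.
    src_2d, tgt_2d: (B, N, 2); depth: (B, N, 1); rotation: (B, 3, 3).
    Uses a simple orthographic projection (drop z) purely for illustration."""
    src_3d = torch.cat([src_2d, depth], dim=-1)               # lift to 3D: (B, N, 3)
    rotated = torch.einsum('bij,bnj->bni', rotation, src_3d)  # rotate toward target pose
    projected = rotated[..., :2]                              # orthographic projection back to 2D
    return ((projected - tgt_2d) ** 2).mean()

# One unsupervised training step on a (source, target) keypoint pair.
model = DepthPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
src, tgt = torch.rand(8, N_KPTS, 2), torch.rand(8, N_KPTS, 2)
R = torch.eye(3).expand(8, 3, 3)  # placeholder viewpoint transform (identity)
opt.zero_grad()
loss = reprojection_loss(src, tgt, model(src), R)
loss.backward()
opt.step()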

Citations

Learning to Restore 3D Face from In-the-Wild Degraded Images

A novel Learning to Restore 3D face framework for unsupervised, high-quality face reconstruction from low-resolution images; it outperforms state-of-the-art methods on low-quality inputs and obtains superior performance to 2D pre-processed modelling approaches with a limited 3D proxy.

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection

A novel Learning to Aggregate and Personalize (LAP) framework for unsupervised, robust 3D face modeling that recovers superior or competitive face shape and texture compared with state-of-the-art (SOTA) methods, with or without priors and supervision.

Deep-MDS Framework for Recovering the 3D Shape of 2D Landmarks from a Single Image

A low-parameter deep learning framework based on the Non-metric Multi-Dimensional Scaling (NMDS) method is proposed to recover the 3D shape of 2D landmarks on a human face from a single input image; the results indicate performance comparable, in terms of accuracy, to powerful state-of-the-art 3D reconstruction methods from the literature.
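As a hedged illustration of the classical idea named in the entry above (plain non-metric MDS, not the paper's deep framework), the snippet below embeds synthetic 2D facial landmarks into 3D so that the recovered pairwise distances preserve the rank order of the 2D dissimilarities; the 68-landmark count and the random data are assumptions.

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

landmarks_2d = np.random.rand(68, 2)             # assumed 68 facial landmarks (synthetic)
dissimilarity = squareform(pdist(landmarks_2d))  # (68, 68) pairwise distance matrix

# Non-metric MDS: find a 3D configuration preserving the rank order of dissimilarities.
nmds = MDS(n_components=3, metric=False, dissimilarity='precomputed',
           n_init=4, random_state=0)
shape_3d = nmds.fit_transform(dissimilarity)     # (68, 3) recovered configuration
print(shape_3d.shape, nmds.stress_)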

Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images

This work proposes a novel unsupervised framework that can synthesize photo-realistic rotated faces using only single-view image collections in the wild, and proves that rotating faces in the 3D space back and forth and re-rendering them to the 2D plane can serve as a strong self-supervision.

Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild

We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination.
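A compact, assumption-level skeleton of the factoring described above (not the authors' implementation): a shared CNN encoder with four heads producing depth, albedo, viewpoint and illumination codes, which a differentiable renderer (omitted here) would recombine into the reconstructed image. The layer sizes and the 64x64 resolution are illustrative choices.

import torch
import torch.nn as nn

class PhotoGeometricEncoder(nn.Module):
    def __init__(self, img_size: int = 64, feat_dim: int = 256):
        super().__init__()
        self.img_size = img_size
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        hw = img_size * img_size
        self.depth_head = nn.Linear(feat_dim, hw)        # per-pixel depth map
        self.albedo_head = nn.Linear(feat_dim, 3 * hw)   # per-pixel RGB albedo
        self.view_head = nn.Linear(feat_dim, 6)          # rotation + translation code
        self.light_head = nn.Linear(feat_dim, 4)         # ambient/diffuse strength + direction

    def forward(self, img: torch.Tensor):
        b, s = img.size(0), self.img_size
        f = self.backbone(img)
        depth = self.depth_head(f).view(b, 1, s, s)
        albedo = self.albedo_head(f).view(b, 3, s, s)
        return depth, albedo, self.view_head(f), self.light_head(f)

# Usage: factor a batch of images into the four components.
depth, albedo, view, light = PhotoGeometricEncoder()(torch.rand(2, 3, 64, 64))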

Photo-Geometric Autoencoding to Learn 3D Objects from Unlabelled Images

This work uses generative models to infer the 3D shape of object categories from raw single-view images, using no external supervision, and demonstrates superior accuracy compared to other methods that use supervision at the level of 2D image correspondences.

Temporal Representation Learning on Monocular Videos for 3D Human Pose Estimation.

This paper shows that applying a contrastive loss only to the time-variant features, encouraging a gradual transition on them between nearby and distant frames, and also reconstructing the input extracts rich temporal features well-suited for human pose estimation.

Predicting Forward & Backward Facial Depth Maps From a Single RGB Image For Mobile 3D AR Application

A novel deep-learning-based solution that predicts robust depth maps of a face, one forward-facing and the other backward-facing, from a single in-the-wild image, by training a fully convolutional neural network to learn the dual depth maps.

Pose Registration of 3D Face Images

In this chapter, the need for registration is discussed, followed by different registration techniques and some new pose detection techniques, which mainly work with 3D face data.

Unsupervised Learning on Monocular Videos for 3D Human Pose Estimation

This paper introduces an unsupervised feature extraction method that exploits contrastive self-supervised (CSS) learning to extract rich latent vectors from single-view videos, and shows that applying CSS only to the time-variant features yields a rich latent space well-suited for human pose estimation.

References

Showing 1-10 of 40 references

Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression

This work proposes to address many of these limitations by training a Convolutional Neural Network (CNN) on an appropriate dataset consisting of 2D images and 3D facial models or scans, and achieves this via a simple CNN architecture that performs direct regression of a volumetric representation of the 3D facial geometry from a single 2D image.

Dense 3D face alignment from 2D video for real-time use

Towards Large-Pose Face Frontalization in the Wild

This work proposes a novel deep 3D Morphable Model (3DMM) conditioned Face Frontalization Generative Adversarial Network (GAN), termed as FF-GAN, to generate neutral head pose face images, which differs from both traditional GANs and 3DMM based modeling.

Dense 3D face alignment from 2D videos in real-time

A 3D cascade regression approach is developed in which facial landmarks remain invariant across pose over a range of approximately 60 degrees; the results strongly support the validity of real-time 3D registration and reconstruction from 2D video.

High-fidelity Pose and Expression Normalization for face recognition in the wild

A High-fidelity Pose and Expression Normalization (HPEN) method with a 3D Morphable Model (3DMM), which can automatically generate a natural face image in frontal pose and neutral expression, and an inpainting method based on Poisson Editing to fill the invisible region caused by self-occlusion are proposed.

Viewing Real-World Faces in 3D

Tal Hassner · 2013 IEEE International Conference on Computer Vision · 2013
An optimization process is described that jointly maximizes the similarity of appearances and depths to those of a reference model, providing a unique means of instant 3D viewing of faces appearing in web photos.

Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision

Adversarial Inverse Graphics Networks (AIGNs) are proposed: weakly supervised neural network models that combine feedback from rendering their predictions with distribution matching between their predictions and a collection of ground-truth factors, and which outperform models supervised only by paired annotations.

Towards Pose Invariant Face Recognition in the Wild

Qualitative and quantitative experiments on both controlled and in-the-wild benchmarks demonstrate the superiority of the proposed Pose Invariant Model for face recognition in the wild over the state of the art.

Disentangled Representation Learning GAN for Pose-Invariant Face Recognition

Quantitative and qualitative evaluation on both controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art.

Effective face frontalization in unconstrained images

This work explores the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces, and shows that this leads to a straightforward, efficient and easy to implement method for frontalization.