Corpus ID: 202789757

Deep Structured Prediction for Facial Landmark Detection

Lisha Chen, Hui Su, Qiang Ji

Existing deep learning based facial landmark detection methods have achieved excellent performance. These methods, however, do not explicitly embed the structural dependencies among landmark points. They hence cannot preserve the geometric relationships between landmark points or generalize well to challenging conditions or unseen data. This paper proposes a method for deep structured facial landmark detection based on combining a deep Convolutional Network with a Conditional Random Field. We… 

2D Wasserstein Loss for Robust Facial Landmark Detection

Robust Facial Landmark Detection via Heatmap-Offset Regression

A two-stage regression network for facial landmark detection on unconstrained conditions that realizes the heatmap-offset framework, which combines the outputs of heatmaps generated by SHN and coordinates estimated by GCN, to obtain an accurate prediction.

Exploiting Self-Supervised and Semi-Supervised Learning for Facial Landmark Tracking with Unlabeled Data

Experiments show that the proposed framework outperforms state-of-the-art semi-supervised facial landmark tracking methods, and also achieves advanced performance compared to fully supervised facial landmark tracking methods.

Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking

A novel attentive one-dimensional heatmap regression method for facial landmark localization that can output high-resolution 1D heatmaps despite limited GPU memory, significantly alleviating the quantization error.

3D to 4D Facial Expressions Generation Guided by Landmarks

This paper proposes a mesh encoder-decoder architecture (ExprED) that exploits a set of 3D landmarks to generate an expressive 3D face from its neutral counterpart and enables the 3D expression intensity to be continuously adapted from low to high intensity.

Deep Graph Pose: a semi-supervised deep graphical model for improved animal pose tracking

This work proposes a probabilistic graphical model built on top of deep neural networks, Deep Graph Pose (DGP), to leverage the rich spatiotemporal structures pervasive in behavioral video, and develops an efficient structured variational approach to perform inference in this model.

Sparse to Dense Dynamic 3D Facial Expression Generation

In this paper, we propose a solution to the task of generating dynamic 3D facial expressions from a neutral 3D face and an expression label. This involves solving two sub-problems: (i) modeling the

Unsupervised Part Segmentation through Disentangling Appearance and Shape

A bottleneck block is designed to squeeze and expand the appearance representation, leading to a more effective disentanglement between geometry and appearance; combined with a self-supervised part classification loss and an improved geometry concentration constraint, the method can segment more consistent parts with semantic meanings.

Review: Facial Anthropometric, Landmark Extraction, and Nasal Reconstruction Technology

Facial anthropometrics are measurements of human faces and are important figures that are used in many different fields, such as cosmetic surgery, protective gear design, reconstruction, etc.

Generating Complex 4D Expression Transitions by Learning Face Landmark Trajectories

A new model is proposed that generates transitions between different expressions and synthesizes long, composed 4D expressions; it improves on previous solutions while retaining good generalization to unseen data.



Recurrent 3D-2D Dual Learning for Large-Pose Facial Landmark Detection

A novel recurrent 3D-2D dual learning model is introduced that alternately performs 2D-based 3D face model refinement and 3D-to-2D projection-based 2D landmark refinement, in order to reason reliably about self-occluded landmarks, precisely capture subtle landmark displacement, and accurately detect landmarks even in the presence of extremely large poses.

Convolutional Experts Constrained Local Model for 3D Facial Landmark Detection

To achieve the best performance on the Menpo3D dense landmark detection challenge, two networks are used: an Adjustment Network, which maps the output of CE-CLM to 84 landmarks, and a Deep Residual Network called Correction Network, which learns dataset-specific corrections for CE-CLM.

Joint Cascade Face Detection and Alignment

The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification; the method learns the two tasks jointly in the same cascade framework by exploiting recent advances in face alignment.

Deep Structure Inference Network for Facial Action Unit Recognition

A deep neural architecture is proposed that combines learned local and global features in its initial stages and, in later stages, replicates a message passing algorithm between classes, similar to a graphical model inference approach, to improve state-of-the-art performance.

Robust Face Landmark Estimation under Occlusion

This work proposes a novel method, called Robust Cascaded Pose Regression (RCPR), which reduces exposure to outliers by detecting occlusions explicitly and using robust shape-indexed features, and shows that RCPR improves on previous landmark estimation methods on three popular face datasets.

Efficient object localization using Convolutional Networks

A novel architecture is introduced which includes an efficient 'position refinement' model that is trained to estimate the joint offset location within a small region of the image, achieving improved accuracy in human joint location estimation.

The Menpo Facial Landmark Localisation Challenge: A Step Towards the Solution

A new benchmark for facial landmark localisation that, contrary to previous benchmarks, contains facial images in both (nearly) frontal and profile pose (the latter annotated with a different markup of facial landmarks).

CRF-CNN: Modeling Structured Information in Human Pose Estimation

A CRF-CNN framework is proposed which can simultaneously model structural information in both output and hidden feature layers in a probabilistic way; it is applied to human pose estimation, and a neural network implementation of end-to-end learning of CRF-CNN is provided.

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

This work specifies a graphical model for human pose which exploits the fact that local image measurements can be used both to detect parts (or joints) and to predict the spatial relationships between them (Image Dependent Pairwise Relations).

Pose-Invariant Face Alignment with a Single CNN

A new layer, named the visualization layer, is proposed, which can be integrated into the CNN architecture and enables joint optimization with different loss functions; the method demonstrates state-of-the-art accuracy while reducing the training time by more than half compared to the typical cascade of CNNs.