• Corpus ID: 239050227

MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation

@article{Liu2021MOSAL,
  title={MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation},
  author={Yepeng Liu and Zaiwang Gu and Shenghua Gao and Dong Wang and Yu Zeng and Jun Cheng},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.10953}
}
With the emergence of service robots and surveillance cameras, dynamic face recognition (DFR) in the wild has received much attention in recent years. Face detection and head pose estimation are two important steps for DFR. Very often, the pose is estimated after face detection. However, such sequential computation leads to higher latency. In this paper, we propose a low latency and lightweight network for simultaneous face detection, landmark localization and head pose estimation. Inspired by…

References

FLDet: A CPU Real-time Joint Face and Landmark Detector
TLDR
This paper proposes a novel single-shot detector for joint face detection and alignment, namely FLDet, with remarkable performance on both speed and accuracy, and introduces a new data augmentation strategy to make full use of the face alignment dataset.
Facial Landmark Detection by Deep Multi-task Learning
TLDR
A novel tasks-constrained deep model is formulated, with task-wise early stopping to facilitate learning convergence; it reduces model complexity drastically compared to the state-of-the-art method based on a cascaded deep model.
HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition
TLDR
The proposed method, HyperFace, fuses the intermediate layers of a deep CNN using a separate CNN followed by a multi-task learning algorithm that operates on the fused features to exploit the synergy among the tasks, which boosts their individual performance.
Fine-Grained Head Pose Estimation Without Keypoints
TLDR
An elegant and robust way to determine pose is presented by training a multi-loss convolutional neural network on 300W-LP, a large synthetically expanded dataset, to predict intrinsic Euler angles directly from image intensities through joint binned pose classification and regression.
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
TLDR
Tests show that the proposed real-time, six degrees of freedom, 3D face pose estimation without face detection or landmark localization outperforms state of the art (SotA) face pose estimators and surpasses SotA models of comparable complexity on the WIDER FACE detection benchmark, despite not being optimized on bounding box labels.
Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks
TLDR
A deep cascaded multitask framework that exploits the inherent correlation between detection and alignment to boost their performance and achieves superior accuracy over the state-of-the-art techniques on the challenging face detection dataset and benchmark.
Selective Refinement Network for High Performance Face Detection
TLDR
A novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel two-step classification and regression operations selectively into an anchor-based face detector to reduce false positives and improve location accuracy simultaneously.
WIDER FACE: A Face Detection Benchmark
TLDR
There is a gap between current face detection performance and real-world requirements. The WIDER FACE dataset, which is 10 times larger than existing datasets, is introduced; it contains rich annotations, including occlusions, poses, event categories, and face bounding boxes.
RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild
TLDR
A novel single-shot, multi-level face localisation method, named RetinaFace, which unifies face box prediction, 2D facial landmark localisation and 3D vertices regression under one common target: point regression on the image plane.
PyramidBox: A Context-assisted Single Shot Face Detector
TLDR
By exploiting the value of context, PyramidBox achieves superior performance among state-of-the-art methods on two common face detection benchmarks, FDDB and WIDER FACE.