Beyond Context: Exploring Semantic Similarity for Tiny Face Detection

  • Yue Xi, Jiangbin Zheng, Xiangjian He, Wenjing Jia, Hanhui Li
  • Published 5 March 2018
  • Computer Science
  • 2018 25th IEEE International Conference on Image Processing (ICIP)
Tiny face detection aims to find faces with high degrees of variability in scale, resolution and occlusion in cluttered scenes. Because tiny faces carry very little information, the content of their tiny bounding boxes, even together with the surrounding context, is not sufficient to detect them. In this paper, we propose to exploit the semantic similarity among all predicted targets in each image to boost current face detectors. To this end, we present a novel framework to…
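As a rough illustration of what "semantic similarity among all predicted targets" could mean in practice, the sketch below computes pairwise cosine similarity between per-detection feature vectors from a single image. It is not the framework proposed in the paper, whose description is truncated above. The function names and the feature values are hypothetical, and cosine similarity is only one assumed choice of similarity measure.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)

def pairwise_similarity(features):
    """Symmetric matrix of cosine similarities among all detections in one image."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical per-detection feature vectors (made-up numbers for illustration).
features = [
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
]
sim = pairwise_similarity(features)
```

In this toy input the first two detections point in the same direction in feature space and so score as highly similar, while the third is orthogonal to both.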

See Clearly in the Distance: Representation Learning GAN for Low Resolution Object Recognition

A Representation Learning Generative Adversarial Network (RL-GAN) is proposed to generate a super image representation optimized for recognition; it improves classification results significantly, with a 10–15% gain on average over benchmark solutions.

Context Attention Module for Human Hand Detection

The proposed CA-FPN achieves state-of-the-art performance on two challenging hand detection datasets, i.e. the Oxford hand dataset and the Vision for Intelligent Vehicles and Applications (VIVA) Challenge dataset.



WIDER FACE: A Face Detection Benchmark

There is a gap between current face detection performance and real-world requirements. The WIDER FACE dataset, which is 10 times larger than existing datasets, is introduced; it contains rich annotations, including occlusions, poses, event categories, and face bounding boxes.

Detecting and Aligning Faces by Image Retrieval

This work presents a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning, and can detect faces under challenging conditions without explicitly modeling their variations.

Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks

A deep cascaded multitask framework that exploits the inherent correlation between detection and alignment to boost their performance, achieving superior accuracy over state-of-the-art techniques on challenging face detection datasets and benchmarks.

Face detection, pose estimation, and landmark localization in the wild

It is shown that tree-structured models are surprisingly effective at capturing global elastic deformation in real-world, cluttered images, while being easy to optimize, unlike dense graph structures.

CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

A face detection approach named Contextual Multi-Scale Region-based Convolutional Neural Network (CMS-RCNN) that robustly addresses the challenges of unconstrained face detection and allows explicit body contextual reasoning in the network, inspired by the intuition of the human vision system.

FaceNet: A unified embedding for face recognition and clustering

A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.

A convolutional neural network cascade for face detection

This work proposes a cascade architecture built on convolutional neural networks (CNNs) with very powerful discriminative capability, while maintaining high performance, and introduces a CNN-based calibration stage after each of the detection stages in the cascade.

Deep Face Recognition

It is shown how a very large-scale dataset can be assembled by a combination of automation and human in the loop, and the trade-off between data purity and time is discussed.

SSD: Single Shot MultiBox Detector

The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.

High-fidelity Pose and Expression Normalization for face recognition in the wild

A High-fidelity Pose and Expression Normalization (HPEN) method based on the 3D Morphable Model (3DMM) is proposed, which can automatically generate a natural face image in frontal pose and neutral expression, together with an inpainting method based on Poisson Editing to fill the invisible region caused by self-occlusion.
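
The default-box discretization mentioned in the SSD entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SSD reference implementation: the function name `default_boxes`, the single `scale` value, and the toy 2×2 feature map are all hypothetical, and details of the published model such as per-layer scale schedules are omitted. Each feature map location gets one box per aspect ratio, with width `scale * sqrt(ar)` and height `scale / sqrt(ar)`, so every box at a given scale covers the same area.

```python
import math

def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes in normalized [0, 1] image
    coordinates for one square feature map (hypothetical helper)."""
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the feature map cell, normalized to the image size.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                w = scale * math.sqrt(ar)   # wider for ar > 1
                h = scale / math.sqrt(ar)   # taller for ar < 1
                boxes.append((cx, cy, w, h))
    return boxes

# Toy example: a 2x2 feature map with three aspect ratios per location.
boxes = default_boxes(fmap_size=2, scale=0.5, aspect_ratios=[1.0, 2.0, 0.5])
```

A detector built on such a grid predicts, for each default box, class scores and offsets relative to that box rather than free-form coordinates.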