Localized region context and object feature fusion for people head detection

  • Yule Li, Yong Dou, Xinwang Liu, Teng Li
  • Published 2016
  • Computer Science
  • 2016 IEEE International Conference on Image Processing (ICIP)
People head detection in crowded scenes is challenging due to the large variability in clothing and appearance, small scales of people, and strong partial occlusions. Traditional bottom-up proposal methods and existing region proposal network approaches suffer from either poor recall or low precision. In this paper, we propose to improve both the recall and precision of head detection of region proposal models by integrating the local head information. Specifically, we first use a region…
Multi-person head segmentation in low resolution crowd scenes using convolutional encoder-decoder framework
This work proposes a multi-person head segmentation algorithm in crowded environments using a convolutional encoder-decoder network which is trained using head probability heatmaps and has demonstrated excellent performance on a challenging spectator crowd dataset.
Detecting Heads using Feature Refine Net and Cascaded Multi-scale Architecture
A novel method, Feature Refine Net (FRN), and a cascaded multi-scale architecture are proposed to improve the performance of small head detection; the proposed channel weighting method enables FRN to make use of features alternatively and effectively.
HeadNet: An End-to-End Adaptive Relational Network for Head Detection
An effective adaptive relational network is proposed to capture context information, which greatly helps to suppress missed detections; it achieves state-of-the-art results on two challenging datasets, i.e., HollywoodHeads and Brainwash.
Scale Mapping and Dynamic Re-Detecting in Dense Head Detection
This paper investigates the influence of head scale and contextual information, and proposes a scale-invariant method for head detection that can dynamically detect heads depending on the complexity of the image.
Head pose estimation with neural networks from surveillant images
This approach consists of two stages, head detection and pose estimation, and uses ResNet-50 as the backbone of the classifier, whose input is the result of head detection.
TCM: Temporal Consistency Model for Head Detection in Complex Videos
A temporal consistency model (TCM) is proposed to enhance the performance of a generic object detector by integrating the spatial-temporal information that exists among subsequent frames of a video, recovering missed detections and suppressing false positives.
Fully Convolutional Network for Crowd Size Estimation by Density Map and Counting Regression
  • B. Wu, Chun-Hsien Lin
  • Computer Science
  • 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
  • 2018
A counting-by-regression framework is employed, where the human head is modeled as a Gaussian distribution, and a deeper and lighter fully convolutional network (FCN) is designed to be a crowd density map estimator.
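The idea of modeling each head as a Gaussian can be illustrated with a minimal sketch of how a ground-truth density map might be built. This is not the cited paper's implementation: the function names, the fixed `sigma`, and the per-head normalization are assumptions chosen for illustration. The point it shows is that when every head contributes a kernel that sums to 1, the sum over the whole map equals the head count.

```python
import math


def gaussian_density_map(heads, height, width, sigma=2.0):
    """Build a density map by summing one normalized 2D Gaussian per head.

    heads: list of (x, y) pixel coordinates (hypothetical annotation format).
    sigma: assumed fixed kernel width; real pipelines may adapt it per head.
    """
    density = [[0.0] * width for _ in range(height)]
    for cx, cy in heads:
        # Evaluate the unnormalized Gaussian kernel at every pixel.
        kernel = [
            [
                math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                for x in range(width)
            ]
            for y in range(height)
        ]
        total = sum(map(sum, kernel))
        # Normalize so that each head adds exactly 1 to the map's total mass.
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density


def count_from_density(density):
    """Estimate the crowd count as the sum over all density values."""
    return sum(map(sum, density))


if __name__ == "__main__":
    heads = [(5, 5), (20, 10), (12, 3)]
    density = gaussian_density_map(heads, height=16, width=32)
    print(round(count_from_density(density), 6))
```

In a counting-by-regression setup, a network would be trained to predict such a map from the image, and the predicted count would be read off by the same summation.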
Representations, Analysis and Recognition of Shape and Motion from Imaging Data
This paper presents a comparison between two core paradigms for computing scene flow from multi-view videos of dynamic scenes. In both approaches, shape and motion estimation are decoupled, in…
Head mouse control system for people with disabilities


Context-Aware CNNs for Person Head Detection
This work leverages person-scene relations and proposes a global CNN model trained to predict positions and scales of heads directly from the full image via an energy-based model where the potentials are computed with a CNN framework.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Edge Boxes: Locating Object Proposals from Edges
A novel method for generating object bounding box proposals using edges is proposed, showing results that are significantly more accurate than the current state-of-the-art while being faster to compute.
Sample-Specific Late Fusion for Visual Category Recognition
This paper identifies the optimal fusion weights for each sample and pushes positive samples to top positions in the fusion score rank list; it formulates the problem as an L∞ norm constrained optimization problem and applies the Alternating Direction Method of Multipliers for the optimization.
Histograms of oriented gradients for human detection
  • N. Dalal, B. Triggs
  • Computer Science
  • 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
  • 2005
It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
BING: Binarized normed gradients for objectness estimation at 300fps
To improve localization quality of the proposals while maintaining efficiency, a novel fast segmentation method is proposed and shown to be effective for improving BING’s localization performance when used in multi-thresholding straddling expansion (MTSE) post-processing.
Object Detection with Discriminatively Trained Part Based Models
We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in…
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
This integrated framework for using Convolutional Networks for classification, localization and detection is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 and obtained very competitive results for the detection and classification tasks.
End-to-End People Detection in Crowded Scenes
This work proposes a model that is based on decoding an image into a set of people detections, which takes an image as input and directly outputs a set of distinct detection hypotheses.