Corpus ID: 236447493

Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework

  • Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu
  • Published 27 July 2021
  • Computer Science
  • ArXiv
Localizing individuals in crowds is more in accordance with the practical demands of subsequent high-level crowd analysis tasks than simply counting. However, existing localization-based methods that rely on intermediate representations (i.e., density maps or pseudo boxes) as learning targets are counter-intuitive and error-prone. In this paper, we propose a purely point-based framework for joint crowd counting and individual localization. For this framework, instead of merely reporting…
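For readers unfamiliar with the density-map representation that the abstract contrasts with point prediction, the following is a minimal sketch of how such a map is commonly built from point annotations: one normalized Gaussian per annotated head, so the map sums to the crowd count. The function names, the fixed `sigma`, and the pure-Python grid are assumptions made for illustration; this is not taken from the paper.

```python
import math


def density_map(points, height, width, sigma=1.5):
    """Build a density map by summing one normalized Gaussian per point.

    `points` is a list of (x, y) head annotations inside the image.
    `sigma` is an illustrative fixed kernel width (an assumption).
    """
    dmap = [[0.0] * width for _ in range(height)]
    for px, py in points:
        # Evaluate an isotropic Gaussian centred on the annotation.
        kernel = [
            [math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize so each annotated person contributes exactly 1.
        total = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                dmap[y][x] += kernel[y][x] / total
    return dmap


def count_from_density(dmap):
    """The estimated count is the integral (sum) of the density map."""
    return sum(map(sum, dmap))


points = [(2, 3), (5, 5), (5.5, 5.2)]
dmap = density_map(points, height=8, width=8)
count = count_from_density(dmap)
```

Note that the two nearby annotations at (5, 5) and (5.5, 5.2) merge into a single blob in the map: the count is preserved, but the individual positions are no longer directly readable, which is the kind of indirection a point-based formulation avoids.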
CCTrans: Simplifying and Improving Crowd Counting with Transformer
This paper proposes a simple approach called CCTrans to simplify the design pipeline for crowd counting by utilizing a pyramid vision transformer backbone to capture the global crowd information, a pyramid feature aggregation (PFA) model to combine low-level and high-level features, and an efficient regression head with multi-scale dilated convolution (MDC) to predict density maps.


Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization
This work proposes the Recurrent Attentive Zooming Network, which recurrently detects ambiguous image regions and zooms them to high resolution for re-inspection, along with an adaptive fusion scheme that effectively elevates performance.
Point in, Box Out: Beyond Counting Persons in Crowds
This work proposes a new deep detection network that simultaneously detects the size and location of human heads and counts them in crowds, together with a curriculum learning strategy that trains the network first on images with relatively accurate and easy pseudo ground truth.
Decoupled Two-Stage Crowd Counting and Beyond
A novel decoupled two-stage counting (D2C) framework is proposed that sequentially regresses the probability map and learns a counter conditioned on the probability map, addressing important data deficiency and sample imbalance problems in counting.
Bayesian Loss for Crowd Count Estimation With Point Supervision
This work proposes Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations, and outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset.
PaDNet: Pan-Density Crowd Counting
The proposed Pan-Density Network (PaDNet) achieves state-of-the-art recognition performance and high robustness in pan-density crowd counting.
Learning Spatial Awareness to Improve Crowd Counting
A novel architecture called SPatial Awareness Network (SPANet) to incorporate spatial context for crowd counting is presented, and the Maximum Excess over Pixels (MEP) loss is proposed to achieve this by finding the pixel-level subregion with high discrepancy to the ground truth.
Locate, Size, and Count: Accurately Resolving People in Dense Crowds via Detection
This work introduces a detection framework for dense crowd counting that eliminates the need for the prevalent density regression paradigm, and shows that LSC-CNN not only localizes better than existing density regressors but also outperforms them in counting.
Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds
A novel approach is proposed that simultaneously solves the problems of counting, density map estimation and localization of people in a given dense crowd image and significantly outperforms state-of-the-art on the new dataset, which is the most challenging dataset with the largest number of crowd annotations in the most diverse set of scenes.
Where are the Blobs: Counting by Localization with Point Supervision
This work proposes a detection-based method that does not need to estimate the size and shape of the objects and that outperforms regression-based methods and even outperforms those that use stronger supervision such as depth features, multi-point annotations, and bounding-box labels.
Adaptive Mixture Regression Network with Local Counting Map for Crowd Counting
This work introduces a new target, named local counting map (LCM), to obtain more accurate results than density-map-based approaches, and proposes an adaptive mixture regression framework with three modules in a coarse-to-fine manner to further improve the precision of the crowd estimation.
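As a rough illustration of what a local counting target can look like, the sketch below bins point annotations into a coarse grid so that each cell stores the number of people in its patch. The function name, the non-overlapping patches, and the `patch` size are assumptions for illustration only; the LCM construction in the cited work may differ (for example, it may be derived from a density map rather than from raw points).

```python
def local_count_map(points, height, width, patch=4):
    """Count annotated points per non-overlapping patch (hypothetical helper).

    Returns a grid with ceil(height / patch) rows and ceil(width / patch)
    columns; each cell holds the number of points falling in that patch.
    """
    rows = -(-height // patch)  # ceiling division
    cols = -(-width // patch)
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore annotations that fall outside the image bounds.
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // patch][int(x) // patch] += 1
    return grid


points = [(1, 1), (2, 3), (6, 1), (7, 7)]
lcm = local_count_map(points, height=8, width=8, patch=4)
total = sum(map(sum, lcm))
```

Summing all cells recovers the image-level count, while each cell on its own is a small local regression target.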