Cross-scene crowd counting via deep convolutional neural networks

Cong Zhang, Hongsheng Li, Xiaogang Wang, Xiaokang Yang
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Cross-scene crowd counting is a challenging task in which people must be counted in new target surveillance scenes, unseen in the training set, without laborious data annotation for those scenes. [...] The proposed switchable learning approach is able to obtain a better local optimum for both objectives. To handle an unseen target crowd scene, we present a data-driven method to fine-tune the trained CNN model for the target scene.

Cross-Scene Crowd Counting via FCN and Gaussian Model
Proposes a Fully Convolutional Neural Network for person detection that performs reliably in cross-scene applications, and establishes a weighted adaptive human Gaussian model according to the different sensitivities of different human body parts in the map.
Switching Convolutional Neural Network for Crowd Counting
Proposes a novel crowd counting model that maps a given crowd scene to its density: a switching convolutional neural network that leverages the variation of crowd density within an image to improve the accuracy and localization of the predicted crowd count.
An Empirical Evaluation of Cross-scene Crowd Counting Performance
This work focuses on real-world, challenging application scenarios when no annotated crowd images from a given target scene are available, and evaluates the cross-scene effectiveness of several regression-based state-of-the-art crowd counting methods, including CNN-based ones, through extensive cross-data set experiments.
Crowd counting with crowd attention convolutional neural network
Crossing-Line Crowd Counting with Two-Phase Deep Neural Networks
This paper proposes a deep convolutional neural network for counting the number of people crossing a line-of-interest (LOI) in surveillance videos, and shows that the proposed method is robust to variations in crowd density, crowd velocity, and LOI direction, and outperforms state-of-the-art LOI counting methods.
End-to-end crowd counting via joint learning local and global count
  C. Shang, H. Ai, Bo Bai. 2016 IEEE International Conference on Image Processing (ICIP), 2016.
Proposes an end-to-end convolutional neural network architecture that takes a whole image as its input and directly outputs the counting result, taking advantage of contextual information when predicting both local and global counts.
Rich Convolutional Features Fusion for Crowd Counting
This work proposes a CNN architecture based on the fully convolutional network, which is used to build an end-to-end density map estimation system by combining some of the meaningful convolutional features.
Pyramid-dilated deep convolutional neural network for crowd counting
A pyramid-dilated deep convolutional neural network for accurate crowd counting called PDD-CNN, based on a VGG-16 network, which produces high-quality density maps and achieves a good counting performance.
CrowdNet: A Deep Convolutional Network for Dense Crowd Counting
This work uses a combination of deep and shallow fully convolutional networks to predict the density map for a given crowd image, and shows that this combination effectively captures both the high-level semantic information and the low-level features necessary for crowd counting under large scale variations.


Deeply learned attributes for crowded scene understanding
This study develops a multi-task deep model to jointly learn and combine appearance and motion features for crowd understanding. It proposes crowd motion channels as the input of the deep model, with a channel design inspired by generic properties of crowd systems.
Fully Convolutional Neural Networks for Crowd Segmentation
In this paper, we propose a fast fully convolutional neural network (FCNN) for crowd segmentation. By replacing the fully connected layers in CNN with 1 by 1 convolution kernels, FCNN takes whole
From Semi-supervised to Transfer Counting of Crowds
This study proposes a unified active and semi-supervised regression framework with the ability to perform transfer learning, by exploiting the underlying geometric structure of crowd patterns via manifold analysis.
Crowd Counting Using Multiple Local Features
An approach that uses local features to count the number of people in each foreground blob segment, so that the total crowd estimate is the sum of the group sizes is proposed.
Deep Learning of Scene-Specific Classifier for Pedestrian Detection
A deep model is proposed to automatically learn scene-specific features and visual patterns in static video surveillance without any manual labels from the target scene to bridge the appearance gap.
Multi-source Multi-scale Counting in Extremely Dense Crowd Images
This work relies on multiple sources, such as low-confidence head detections, repetition of texture elements, and frequency-domain analysis, to estimate counts in an image region, along with the confidence associated with observing individuals, and employs a global consistency constraint on counts using a Markov Random Field.
DeepReID: Deep Filter Pairing Neural Network for Person Re-identification
A novel filter pairing neural network (FPNN) to jointly handle misalignment, photometric and geometric transforms, occlusions and background clutter is proposed and significantly outperforms state-of-the-art methods on this dataset.
Data-driven crowd analysis in videos
A new crowd analysis algorithm powered by behavior priors learned on a large database of crowd videos gathered from the Internet. It performs like state-of-the-art methods when tracking people who exhibit common crowd behaviors, and outperforms them when the tracked individual behaves in an unusual way.
Density-aware person detection and tracking in crowds
This work addresses the problem of person detection and tracking in crowded video scenes by exploring constraints imposed by the crowd density, and formulates person detection as the optimization of a joint energy function combining crowd density estimation and the localization of individual people.
Feature Mining for Localised Crowd Counting
This paper presents an approach based on a single regression model that is able to estimate people counts in spatially localised regions, and is more scalable because it does not need to train a large number of regressors proportional to the number of local regions.