NAS-Count: Counting-by-Density with Neural Architecture Search

@inproceedings{Hu2020NASCountCW,
  title={NAS-Count: Counting-by-Density with Neural Architecture Search},
  author={Yutao Hu and Xiaolong Jiang and Xuhui Liu and Baochang Zhang and Jungong Han and Xianbin Cao and David S. Doermann},
  booktitle={ECCV},
  year={2020}
}
Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts. In this work, we automate the design of counting models with Neural Architecture Search (NAS) and introduce an end-to-end searched encoder-decoder architecture, Automatic Multi-Scale Network (AMSNet). Specifically, we utilize a counting-specific two-level search… 
S4-Crowd: Semi-Supervised Learning with Self-Supervised Regularisation for Crowd Counting
TLDR
A semi-supervised learning framework, S4-Crowd, is proposed, which can leverage both unlabeled and labeled data for robust crowd modelling, together with a crowd-driven recurrent unit, the Gated-Crowd-Recurrent-Unit (GCRU), which can preserve discriminant crowd information by extracting second-order statistics, yielding pseudo labels with improved quality.
AutoScale: Learning to Scale for Crowd Counting
TLDR
A simple and effective Learning to Scale (L2S) module is proposed, which automatically scales dense regions into reasonable closeness levels (reflecting image-plane distance between neighboring people), along with a customized dynamic cross-entropy loss that significantly improves the localization-based model optimization.
Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap
TLDR
A literature review on the application of NAS to computer vision problems is provided and existing approaches are summarized into several categories according to their efforts in bridging the gap.
A Generalized Loss Function for Crowd Counting and Localization
TLDR
This paper investigates learning the density map representation through an unbalanced optimal transport problem, and proposes a generalized loss function to learn density maps for crowd counting and localization, proving that pixel-wise L2 loss and Bayesian loss are special cases and suboptimal solutions to the proposed loss function.
Alignment Enhancement Network for Fine-grained Visual Categorization
Fine-grained visual categorization (FGVC) aims to automatically recognize objects from different sub-ordinate categories. Despite attracting considerable attention from both academia and industry, it
Approaches on crowd counting and density estimation: a review
TLDR
The typical methods in crowd counting are introduced and analyzed, and work on small-sample-based counting, dataset annotation methods, and related problems is discussed.
Boosting Crowd Counting with Transformers
TLDR
A pure transformer is used to extract features with global information from overlapping image patches and to predict the total person count of the image through a regression-token module (RTM); a token-attention module (TAM) is proposed to recalibrate encoded features through channel-wise attention informed by the context token.
Counting People by Estimating People Flows
TLDR
This paper advocates estimating people flows across image locations between consecutive images and inferring the people densities from these flows instead of directly regressing them, which enables us to impose much stronger constraints encoding the conservation of the number of people.

References

Context-Aware Crowd Counting
TLDR
This paper introduces an end-to-end trainable deep architecture that combines features obtained using multiple receptive field sizes and learns the importance of each such feature at each image location, which yields an algorithm that outperforms state-of-the-art crowd counting methods, especially when perspective effects are strong.
Single-Image Crowd Counting via Multi-Column Convolutional Neural Network
TLDR
With the proposed simple MCNN model, the method outperforms all existing methods and experiments show that the model, once trained on one dataset, can be readily transferred to a new dataset.
Cross-scene crowd counting via deep convolutional neural networks
TLDR
A deep convolutional neural network is proposed for crowd counting, and it is trained alternately with two related learning objectives, crowd density and crowd count, to obtain a better local optimum for both objectives.
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
TLDR
This paper presents a network level search space that includes many popular designs, and develops a formulation that allows efficient gradient-based architecture search and demonstrates the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets.
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
  • In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  • 2019
Context-Aware Crowd Counting
  • In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  • June 2019
PC-DARTS: Partial Channel Connections for Memory-Efficient Differentiable Architecture Search
TLDR
This paper presents a novel approach, namely, Partially-Connected DARTS, which samples a small part of the super-network to reduce the redundancy in exploring the network space, thereby performing a more efficient search without compromising performance.
Scale Aggregation Network for Accurate and Efficient Crowd Counting
TLDR
A novel training loss combining Euclidean loss and local pattern consistency loss is proposed, which improves the performance of the model in the authors' experiments and achieves superior performance to state-of-the-art methods with far fewer parameters.
Multi-source Multi-scale Counting in Extremely Dense Crowd Images
TLDR
This work relies on multiple sources such as low confidence head detections, repetition of texture elements, and frequency-domain analysis to estimate counts, along with confidence associated with observing individuals, in an image region, and employs a global consistency constraint on counts using Markov Random Field.
Attentional Neural Fields for Crowd Counting
TLDR
The CRFs coupled with the attention mechanism are seamlessly integrated into the encoder-decoder network, establishing an ANF that can be optimized end-to-end by back propagation, surpassing most previous methods.