Corpus ID: 198179934

Compact Global Descriptor for Neural Networks

@article{He2019CompactGD,
  title={Compact Global Descriptor for Neural Networks},
  author={Xiangyu He and Ke Cheng and Qiang Chen and Qinghao Hu and Peisong Wang and Jian Cheng},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.09665}
}
Long-range dependency modeling, widely used to capture spatiotemporal correlations, has been shown to be effective in CNN-dominated computer vision tasks. [...] Key Method This descriptor enables subsequent convolutions to access informative global features with negligible computational complexity and parameters. Benchmark experiments show that the proposed method can compete with state-of-the-art long-range mechanisms with a significant reduction in extra computing cost. Code available at https://github.com…
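The abstract leaves the exact formulation to the paper, but the idea of a compact global descriptor that later convolutions can read is easy to illustrate. A minimal PyTorch sketch, assuming global average pooling for the descriptor, a 1x1 mapping for cross-channel interaction, and a sigmoid gate to feed it back (all assumptions on my part, not necessarily the paper's design):

import torch
import torch.nn as nn

class GlobalDescriptor(nn.Module):
    """Minimal sketch of a compact global-descriptor block.

    Pools the feature map to a per-channel descriptor, models channel
    interactions with a lightweight 1x1 mapping, and modulates the input
    so subsequent convolutions see global context. The paper's actual
    formulation may differ; this only illustrates the abstract's idea.
    """
    def __init__(self, channels: int):
        super().__init__()
        # One 1x1 conv on the pooled vector: at most O(C^2) parameters,
        # negligible next to the backbone.
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> global descriptor g: (N, C, 1, 1)
        g = x.mean(dim=(2, 3), keepdim=True)
        # Model cross-channel interactions, then gate the input.
        return x * torch.sigmoid(self.mix(g))

x = torch.randn(2, 64, 32, 32)
y = GlobalDescriptor(64)(x)
assert y.shape == x.shape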
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
TLDR
Pyramid Non-Local (PNL) module is proposed, which extends the non-local block by incorporating regional correlation at multiple scales through a pyramid-structured module, improving the effectiveness of the non-local operation by attending to interactions between different regions.
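A hedged sketch of the pyramid idea, assuming keys and values are average-pooled at several scales before standard dot-product attention (the concrete PNL design may differ; scales and channel widths are illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidNonLocal(nn.Module):
    """Attend queries to keys/values pooled at multiple scales, so each
    position can attend to regions as well as pixels."""
    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.kv = nn.Conv2d(channels, 2 * channels, 1)
        self.scales = scales  # assumes H, W divisible by each scale

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (N, HW, C)
        ks, vs = [], []
        for s in self.scales:
            p = F.avg_pool2d(x, s) if s > 1 else x      # regional tokens
            k, v = self.kv(p).flatten(2).chunk(2, dim=1)
            ks.append(k)
            vs.append(v)
        k = torch.cat(ks, dim=2)                        # (N, C, L)
        v = torch.cat(vs, dim=2).transpose(1, 2)        # (N, L, C)
        attn = torch.softmax(q @ k, dim=-1)             # (N, HW, L)
        y = (attn @ v).transpose(1, 2).reshape(n, c, h, w)
        return x + y                                    # residual connection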
A Spectral Nonlocal Block for Neural Networks
TLDR
A unified approach to interpreting nonlocal-based blocks is provided: they are viewed as graph filters generated on a fully-connected graph, and can be flexibly inserted into deep neural networks to capture long-range dependencies between spatial pixels or temporal frames.
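The graph-filter view is visible in a few lines: the affinity matrix of a non-local block is a row-normalized adjacency over all positions of a fully-connected graph, and the block output is one propagation step over that graph:

import torch

# 16 positions with 8-dim features, treated as nodes of a graph.
x = torch.randn(16, 8)
A = torch.softmax(x @ x.t(), dim=-1)   # row-normalized affinity = adjacency
out = A @ x                            # one step of graph filtering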
Regularization on Spatio-Temporally Smoothed Feature for Action Recognition
TLDR
Experimental results show an improvement in generalization performance on popular action recognition datasets, demonstrating the effectiveness of RMS as a regularization technique compared to other state-of-the-art regularization methods.
Effective Action Recognition with Embedded Key Point Shifts
TLDR
This paper proposes a novel temporal feature extraction module, named the Key Point Shifts Embedding Module (KPSEM), to adaptively extract channel-wise key point shifts across video frames without key point annotation.
Unifying Nonlocal Blocks for Neural Networks
  • Lei Zhu, Qi She, +4 authors Changhu Wang
  • Computer Science
  • ArXiv
  • 2021
TLDR
By exploiting spectral properties, this paper proposes an efficient and robust spectral nonlocal block that is more robust and flexible than existing nonlocal blocks at capturing long-range dependencies when inserted into deep neural networks.

References

Showing 1–10 of 69 references
Compact Generalized Non-local Network
TLDR
This extension utilizes a compact representation of multiple kernel functions via Taylor expansion, which gives the generalized non-local module a fast, low-complexity computation flow, and implements the generalized non-local method within channel groups to ease optimization.
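The Taylor-expansion trick amounts to replacing the kernel k(x, y) with an inner product of feature maps phi(x) and phi(y), after which associativity removes the quadratic cost. A generic sketch of that linear-time contraction (phi here is a ReLU stand-in, not the paper's actual expansion terms):

import torch

def linear_attention(q, k, v, phi=torch.relu):
    """Kernelized attention in linear time.

    With k(x, y) ~ phi(x) . phi(y), keys can be contracted with values
    first. q, k: (N, L, D); v: (N, L, Dv). Cost O(L * D * Dv) rather
    than the O(L^2) of explicit pairwise attention.
    """
    q, k = phi(q), phi(k)
    kv = k.transpose(1, 2) @ v                                  # (N, D, Dv)
    z = 1.0 / (q @ k.sum(dim=1, keepdim=True).transpose(1, 2) + 1e-6)
    return (q @ kv) * z                                         # (N, L, Dv)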
A2-Nets: Double Attention Networks
TLDR
This work proposes the "double attention block", a novel component that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features from the entire space efficiently.
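A sketch of the two attention steps, gather then distribute, following the published A^2 formulation (the bottleneck sizes m and n are illustrative):

import torch
import torch.nn as nn

class DoubleAttention(nn.Module):
    """A^2 block: gather global descriptors with one attention step,
    then distribute them back to every position with a second."""
    def __init__(self, channels: int, m: int = 64, n: int = 64):
        super().__init__()
        self.feat = nn.Conv2d(channels, m, 1)   # feature maps
        self.gath = nn.Conv2d(channels, n, 1)   # gathering attention
        self.dist = nn.Conv2d(channels, n, 1)   # distribution attention
        self.out = nn.Conv2d(m, channels, 1)

    def forward(self, x):
        N, C, H, W = x.shape
        a = self.feat(x).flatten(2)                      # (N, m, HW)
        b = torch.softmax(self.gath(x).flatten(2), -1)   # over positions
        v = torch.softmax(self.dist(x).flatten(2), 1)    # over n descriptors
        g = a @ b.transpose(1, 2)                        # gather: (N, m, n)
        z = (g @ v).reshape(N, -1, H, W)                 # distribute
        return x + self.out(z)                           # residual connection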
Non-local Neural Networks
TLDR
This paper presents non-local operations as a generic family of building blocks for capturing long-range dependencies in computer vision and improves object detection/segmentation and pose estimation on the COCO suite of tasks.
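The embedded-Gaussian instantiation of the block is compact enough to sketch directly (note the O((HW)^2) attention map, which is the cost that later works above try to reduce):

import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block: each position attends to
    every other position of the feature map."""
    def __init__(self, channels: int, inner: int = None):
        super().__init__()
        inner = inner or channels // 2
        self.theta = nn.Conv2d(channels, inner, 1)
        self.phi = nn.Conv2d(channels, inner, 1)
        self.g = nn.Conv2d(channels, inner, 1)
        self.out = nn.Conv2d(inner, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (N, HW, C')
        k = self.phi(x).flatten(2)                     # (N, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (N, HW, C')
        attn = torch.softmax(q @ k, dim=-1)            # (N, HW, HW)
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                         # residual connection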
Multi-Scale Context Aggregation by Dilated Convolutions
TLDR
This work develops a new convolutional network module that is specifically designed for dense prediction, and shows that the presented context module increases the accuracy of state-of-the-art semantic segmentation systems.
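A small example of the mechanism: stacking 3x3 convolutions with dilations 1, 2, 4 grows the receptive field exponentially while keeping full resolution, which is what suits the module to dense prediction:

import torch
import torch.nn as nn

# Receptive field grows 3 -> 7 -> 15 pixels while each layer keeps
# nine weights and the spatial resolution never shrinks.
context = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=4, dilation=4), nn.ReLU(),
)
x = torch.randn(1, 64, 65, 65)
assert context(x).shape == x.shape   # no downsampling: dense prediction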
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
TLDR
This work proposes a simple, lightweight solution to the issue of limited context propagation in ConvNets, which propagates context across a group of neurons by aggregating responses over their extent and redistributing the aggregates back through the group.
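A sketch of the simplest, parameter-free configuration: gather with average pooling over an extent, excite by upsampling the aggregate and gating (the extent value is illustrative, and this is my reading of the lightest variant, not the full method):

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatherExcite(nn.Module):
    """Parameter-free gather-excite sketch."""
    def __init__(self, extent: int = 8):
        super().__init__()
        self.extent = extent  # assumes H, W divisible by the extent

    def forward(self, x):
        # Gather: aggregate responses over each extent-sized region.
        g = F.avg_pool2d(x, self.extent)
        # Excite: redistribute the aggregate back to every position.
        g = F.interpolate(g, size=x.shape[-2:], mode='nearest')
        return x * torch.sigmoid(g)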
Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
TLDR
The notion of an effective receptive field size is introduced, and it is shown that it both has a Gaussian distribution and occupies only a fraction of the full theoretical receptive field.
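The effect is easy to probe empirically: backpropagate from a single center output unit and inspect how input-gradient magnitude decays with distance (a rough single-sample probe, not the paper's averaged protocol):

import torch
import torch.nn as nn

# Ten 3x3 conv layers give a 21x21 theoretical receptive field, but
# the gradient from the center output concentrates in a much smaller,
# roughly Gaussian-shaped region.
net = nn.Sequential(*[nn.Conv2d(1, 1, 3, padding=1) for _ in range(10)])
x = torch.randn(1, 1, 101, 101, requires_grad=True)
y = net(x)
y[0, 0, 50, 50].backward()            # gradient from the center unit only
erf = x.grad.abs()[0, 0]
print(erf[50, 30:71:10])              # magnitude falls off from the center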
Feature Pyramid Networks for Object Detection
TLDR
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
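The pyramid itself is a few lines: 1x1 lateral connections plus a top-down pathway of upsample-and-add, followed by 3x3 smoothing convolutions (the channel counts here are illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

class FPN(nn.Module):
    """Minimal FPN top-down pathway over backbone features C3..C5."""
    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1)
                                     for c in in_channels)
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, padding=1)
            for _ in in_channels)

    def forward(self, feats):
        # feats: [C3, C4, C5], fine to coarse (e.g. strides 8, 16, 32).
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):   # top-down pathway
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode='nearest')
        return [s(l) for s, l in zip(self.smooth, laterals)]  # P3..P5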
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Squeeze-and-Excitation Networks
TLDR
This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
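The SE block is small enough to reproduce from the description: squeeze with global average pooling, excite with a two-layer bottleneck, then rescale the channels (reduction ratio 16 as in the paper):

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block, as described above."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # squeeze bottleneck
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                # per-channel gates
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))             # squeeze: global average pool
        w = self.fc(s).view(n, c, 1, 1)    # excite: channel-wise weights
        return x * w                       # recalibrate channel responses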
Deformable Convolutional Networks
TLDR
This work introduces two new modules to enhance the transformation modeling capability of CNNs, namely, deformable convolution and deformable RoI pooling, based on the idea of augmenting the spatial sampling locations in the modules with additional offsets and learning the offsets from the target tasks, without additional supervision.
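A sketch of deformable convolution using torchvision's deform_conv2d op: a plain convolution predicts per-tap 2D offsets, zero-initialized so the layer starts out as a regular convolution (a common initialization, assumed here rather than taken from the paper):

import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformConv(nn.Module):
    """3x3 deformable convolution: offsets are learned from the task."""
    def __init__(self, cin: int, cout: int, k: int = 3):
        super().__init__()
        self.offset = nn.Conv2d(cin, 2 * k * k, k, padding=k // 2)
        self.weight = nn.Parameter(torch.randn(cout, cin, k, k) * 0.01)
        nn.init.zeros_(self.offset.weight)   # zero offsets at start
        nn.init.zeros_(self.offset.bias)

    def forward(self, x):
        off = self.offset(x)                 # (N, 2*k*k, H, W) offsets
        return deform_conv2d(x, off, self.weight, padding=1)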