Unifying Nonlocal Blocks for Neural Networks

  • Lei Zhu, Qi She, Duo Li, Yanye Lu, Xuejing Kang, Jie Hu, Changhu Wang
  • Published 5 August 2021
  • Computer Science
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Nonlocal-based blocks are designed to capture long-range spatial-temporal dependencies in computer vision tasks. Although they have shown excellent performance, they still lack a mechanism to encode the rich, structured information among elements in an image or video. In this paper, to theoretically analyze the properties of these nonlocal-based blocks, we provide a new perspective to interpret them, where we view them as a set of graph filters generated on a fully-connected graph…
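
To make the graph view in the abstract concrete, here is a minimal NumPy sketch of a standard embedded-Gaussian nonlocal operation (the formulation from "Non-local Neural Networks", listed further down), not the unified block this paper proposes. The function name, the weight names, and the array sizes are assumptions chosen for illustration. The point it shows is that the softmax affinity matrix is dense and strictly positive, so it can be read as the weighted adjacency matrix of a fully-connected graph over positions, and the aggregation step as a filter applied on that graph.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def nonlocal_block(x, w_theta, w_phi, w_g, w_out):
    """Embedded-Gaussian nonlocal operation with a residual connection.

    x: (N, C) array of N position features (e.g. flattened H*W pixels).
    Returns the block output (N, C) and the affinity matrix (N, N).
    """
    theta = x @ w_theta          # queries, (N, C')
    phi = x @ w_phi              # keys, (N, C')
    g = x @ w_g                  # values, (N, C')
    # Every position attends to every other position, so the affinity
    # matrix acts as a weighted adjacency matrix of a fully-connected graph.
    affinity = softmax(theta @ phi.T, axis=-1)   # (N, N), rows sum to 1
    y = affinity @ g                             # aggregate over all nodes
    return x + y @ w_out, affinity


rng = np.random.default_rng(0)
n, c, c_inner = 6, 8, 4   # hypothetical sizes, chosen only for illustration
x = rng.standard_normal((n, c))
w_theta, w_phi, w_g = (rng.standard_normal((c, c_inner)) * 0.1 for _ in range(3))
w_out = rng.standard_normal((c_inner, c)) * 0.1

out, affinity = nonlocal_block(x, w_theta, w_phi, w_g, w_out)
print(out.shape, affinity.shape)
```

In this reading, `affinity @ g` applies a single first-order filter on the graph. How the paper generalizes this to a set of graph filters is not recoverable from the truncated abstract, so the sketch stops at the baseline operation.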


MLPT: Multilayer Perceptron based Tracking

This paper presents a simple yet effective Multilayer Perceptron-based Tracking (MLPT) method with a global receptive field, which is the first MLP-based baseline architecture for object tracking.

Attention-Augmented Memory Network for Image Multi-Label Classification

Experimental results on standard benchmarks, including MS-COCO 2014, PASCAL VOC 2007, and VG-500, demonstrate the effectiveness of the AAMN model, which outperforms current state-of-the-art methods.

Segmentation and Measurement of Superalloy Microstructure Based on Improved Nonlocal Block

The microstructure of superalloy materials has a decisive impact on its service performance. When preparing the material and photographing the microstructure, different depths of metallography

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization

This paper presents a plug-and-play mechanism called BagCAMs that better projects a well-trained classifier onto the localization task without refining or retraining the baseline structure, and that can substantially improve the performance of baseline WSOL methods.

Prediction of Prospecting Target Based on Selective Transfer Network

In recent years, with the integration and development of artificial intelligence technology and geology, traditional geological prospecting has begun to change to intelligent prospecting. Intelligent

WHU-OHS: A benchmark dataset for large-scale Hyperspectral Image classification



Compact Generalized Non-local Network

This extension uses a compact representation of multiple kernel functions, based on Taylor expansion, that gives the generalized non-local module a fast, low-complexity computation flow, and it implements the generalized non-local method within channel groups to ease optimization.

Squeeze-and-Excitation Networks

This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.

A2-Nets: Double Attention Networks

This work proposes the "double attention block", a novel component that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features from the entire space efficiently.

Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling

The diffusion and damping effects of nonlocal networks are studied through spectrum analysis of the weight matrices of well-trained networks, and a new formulation of the nonlocal block is proposed, which not only learns the nonlocal interactions but also has stable dynamics, thus allowing deeper nonlocal structures.

Non-local Neural Networks

This paper presents non-local operations as a generic family of building blocks for capturing long-range dependencies in computer vision and improves object detection/segmentation and pose estimation on the COCO suite of tasks.

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

This work proposes a simple, lightweight solution to the issue of limited context propagation in ConvNets, which propagates context across a group of neurons by aggregating responses over their extent and redistributing the aggregates back through the group.

Compact Global Descriptor for Neural Networks

A generic family of lightweight global descriptors for modeling the interactions between positions across different dimensions that enables subsequent convolutions to access the informative global features with negligible computational complexity and parameters is presented.

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

This work addresses the task of semantic image segmentation with deep learning and proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.

Deformable ConvNets V2: More Deformable, Better Results

This work presents a reformulation of Deformable Convolutional Networks that improves its ability to focus on pertinent image regions, through increased modeling power and stronger training, and guides network training via a proposed feature mimicking scheme that helps the network to learn features that reflect the object focus and classification power of R-CNN features.

Deformable Convolutional Networks

This work introduces two new modules to enhance the transformation modeling capability of CNNs, namely, deformable convolution and deformable RoI pooling, based on the idea of augmenting the spatial sampling locations in the modules with additional offsets and learning the offsets from the target tasks, without additional supervision.