Residual Attention Network for Image Classification

@inproceedings{Wang2017ResidualAN,
  title={Residual Attention Network for Image Classification},
  author={Fei Wang and Mengqing Jiang and Chen Qian and Shuo Yang and Cheng Li and Honggang Zhang and Xiaogang Wang and Xiaoou Tang},
  booktitle={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017}
}
In this work, we propose Residual Attention Network, a convolutional neural network using an attention mechanism that can be incorporated into state-of-the-art feed-forward network architectures in an end-to-end training fashion. Inside each Attention Module, a bottom-up top-down feedforward structure is used to unfold the feedforward and feedback attention process into a single feedforward process.
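The attention residual learning this abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the trunk branch is the identity, and a single 2x2 average pooling plus nearest-neighbour upsampling stand in for the stacked bottom-up top-down mask branch; only the output form H(x) = (1 + M(x)) * T(x) follows the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bottom_up_top_down_mask(x):
    # Bottom-up: 2x2 average pooling shrinks the spatial grid to
    # gather context (stand-in for the stacked down-sampling layers).
    h, w = x.shape
    pooled = x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Top-down: nearest-neighbour upsampling restores the resolution
    # (stand-in for the paper's interpolation step).
    upsampled = pooled.repeat(2, axis=0).repeat(2, axis=1)
    return sigmoid(upsampled)  # soft mask M(x), values in (0, 1)

def attention_module(x):
    trunk = x  # trunk branch T(x): identity in this toy sketch
    mask = bottom_up_top_down_mask(x)
    # Attention residual learning: H(x) = (1 + M(x)) * T(x), so the
    # mask enhances good features instead of gating them toward zero.
    return (1.0 + mask) * trunk

feat = np.random.randn(8, 8)
out = attention_module(feat)
print(out.shape)  # (8, 8)
```

Because the mask lies in (0, 1), the output magnitude stays between |T(x)| and 2|T(x)|, which is the point of the residual formulation: a badly learned mask cannot erase the trunk features.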


Improvement of Residual Attention Network for Image Classification
An improved residual attention network for image classification that applies several upsampling schemes within the RAN pipeline, i.e., the stacked network structure and the bottom-up top-down feedforward attention used for residual learning.
An Attention Module for Convolutional Neural Networks
This work proposes an attention module for convolutional neural networks by developing an AW-convolution, where the shape of attention maps matches that of the weights rather than the activations, and shows the effectiveness of this module on several datasets for image classification and object detection tasks.
Multi-layer Attention Aggregation in Deep Neural Network
  • Zetan Zhang
  • Computer Science
    2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC)
  • 2019
The proposed Multi-layer Attention Aggregation (MAA) model, a convolutional architecture that iteratively applies an attention mechanism and a global aggregation module, improves image classification performance by merging attention-aware features from every convolutional stage.
Dense Attention Convolutional Network for Image Classification
A dense attention convolutional neural network (DA-CNN) for visual recognition is built that outperforms many state-of-the-art methods and the effectiveness of the dense attention learning and channel-wise attention module is validated.
Learning Connected Attentions for Convolutional Neural Networks
  • Xu Ma, Jingda Guo, Song Fu
  • Computer Science
    2021 IEEE International Conference on Multimedia and Expo (ICME)
  • 2021
Deep Connected Attention Network (DCANet) is a novel design that boosts attention modules in a CNN model without modifying their internal structure; to achieve this, it interconnects adjacent attention blocks, making information flow among attention blocks possible.
Improved Residual Networks for Image and Video Recognition
The proposed improvements address all three main components of a ResNet: the flow of information through the network layers, the residual building block, and the projection shortcut, and are able to show consistent improvements in accuracy and learning convergence over the baseline.
Loss-Based Attention for Interpreting Image-Level Prediction of Convolutional Neural Networks
This paper proposes a novel and general attention mechanism, loss-based attention, upon which deep neural networks are modified to mine significant image patches for explaining which parts determine the image decision-making.
Residual Attention Convolutional Network for Online Visual Tracking
A residual attention module is added to a one-layer convolutional network to inhibit the loss of discriminative ability caused by overfitting, achieving favorable performance compared with state-of-the-art trackers.
BA^2M: A Batch Aware Attention Module for Image Classification
A batch-aware attention module (BA^2M) for feature enrichment from a distinctive perspective that can boost the performance of various network architectures and outperforms many classical attention methods.


The application of two-level attention models in deep convolutional neural network for fine-grained image classification
This paper proposes to apply visual attention to fine-grained classification task using deep neural network and achieves the best accuracy under the weakest supervision condition, and is competitive against other methods that rely on additional annotations.
Attention to Scale: Scale-Aware Semantic Image Segmentation
An attention mechanism that learns to softly weight the multi-scale features at each pixel location is proposed, which not only outperforms average- and max-pooling, but allows us to diagnostically visualize the importance of features at different positions and scales.
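The per-pixel soft weighting of multi-scale features summarized above amounts to a softmax over the scale axis. A minimal NumPy sketch, under the assumption that the per-scale feature maps have already been resized to a common resolution and that the attention logits come from some learned branch (both hypothetical here):

```python
import numpy as np

def scale_attention(features, scores):
    """features: (S, H, W) maps from S scales, resized to a common
    H x W grid; scores: (S, H, W) learned attention logits."""
    # Numerically stable softmax across the scale axis: per-pixel
    # weights that sum to 1, softly selecting a scale at each location.
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    weights = e / e.sum(axis=0, keepdims=True)
    # Weighted sum over scales replaces average- or max-pooling.
    return (weights * features).sum(axis=0)  # (H, W)
```

Since the weights are a convex combination, the fused value at each pixel lies between the minimum and maximum of the per-scale responses, unlike max-pooling, which always commits to a single scale.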
Deep Networks with Internal Selective Attention through Feedback Connections
DasNet harnesses the power of sequential processing to improve classification performance, by allowing the network to iteratively focus its internal attention on some of its convolutional filters.
Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks
The background of feedbacks in the human visual cortex is introduced, which motivates the development of a computational feedback mechanism in deep neural networks, and a feedback loop is introduced to infer the activation status of hidden layer neurons according to the "goal" of the network.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
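The residual learning framework mentioned above is compact enough to sketch: a block computes a residual F(x) and adds an identity shortcut, so a block whose weights are near zero behaves as the identity and deeper stacks are no harder to optimize. A minimal NumPy sketch; `weight_fn`, standing in for the block's stacked convolutional layers, is hypothetical:

```python
import numpy as np

def residual_block(x, weight_fn):
    # The block learns the residual F(x); the identity shortcut means
    # it only has to model the *change* to x, not x itself.
    return weight_fn(x) + x

# Toy check: a residual branch initialised at zero makes the whole
# block an exact identity mapping.
x = np.arange(4.0)
print(residual_block(x, lambda t: 0.0 * t))  # [0. 1. 2. 3.]
```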
Diversified Visual Attention Networks for Fine-Grained Object Classification
A diversified visual attention network (DVAN) is proposed to address the problem of fine-grained object classification, which substantially relieves the dependency on strongly supervised information for learning to localize discriminative regions compared with attention-less models.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Clear empirical evidence that training with residual connections accelerates the training of Inception networks significantly is given and several new streamlined architectures for both residual and non-residual Inception Networks are presented.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC 2014).
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling
The results show that SegNet achieves state-of-the-art performance even without use of additional cues such as depth, video frames or post-processing with CRF models.