Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks

@inproceedings{Chattopadhyay2018GradCAMGG,
  title={Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks},
  author={Aditya Chattopadhyay and Anirban Sarkar and Prantik Howlader and Vineeth N. Balasubramanian},
  booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2018},
  pages={839--847}
}
Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision-based problems. [...] Key Method: We provide a mathematical explanation for the proposed method, Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration. Our extensive experiments and evaluations, both [...]
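The weighting scheme described above can be sketched in NumPy. This is a minimal sketch, not the authors' implementation: it assumes the class score is passed through an exponential so that the paper's pixel-wise alphas reduce to a closed form in powers of the first-order gradient, and it takes precomputed activations and gradients as plain arrays.

```python
import numpy as np

def grad_cam_pp(feature_maps, gradients):
    """Sketch of Grad-CAM++ weighting.

    feature_maps: (K, H, W) last-conv-layer activations
    gradients:    (K, H, W) d(score)/d(activation), same shape
    Assumes the exp-score form, so alpha = g^2 / (2*g^2 + sum(A)*g^3).
    """
    g2 = gradients ** 2
    g3 = gradients ** 3
    sum_a = feature_maps.sum(axis=(1, 2), keepdims=True)
    denom = 2.0 * g2 + sum_a * g3
    safe = np.where(denom == 0.0, 1.0, denom)          # avoid div by zero
    alphas = np.where(denom == 0.0, 0.0, g2 / safe)
    # weights: alpha-weighted sum of the POSITIVE partial derivatives
    weights = (alphas * np.maximum(gradients, 0.0)).sum(axis=(1, 2))  # (K,)
    cam = np.tensordot(weights, feature_maps, axes=1)                 # (H, W)
    return np.maximum(cam, 0.0)
```

In practice the gradients would come from autodiff on a real network; here they are inputs so the weighting itself stays visible.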
Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
TLDR
The Smooth Grad-CAM++ technique provides the capability of either visualizing a layer, subset of feature maps, or subset of neurons within a feature map at each instance at the inference level (model prediction process).
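The "smooth" part of Smooth Grad-CAM++ is SmoothGrad-style noise averaging: run the explanation on several perturbed copies of the input and average the maps. A minimal sketch, where `explain_fn` is a hypothetical stand-in for any map-producing method (such as Grad-CAM++ wired to a model):

```python
import numpy as np

def smooth_map(explain_fn, image, n_samples=8, sigma=0.1, rng=None):
    """Average an explanation map over Gaussian-perturbed inputs."""
    rng = rng or np.random.default_rng(0)
    maps = [explain_fn(image + rng.normal(0.0, sigma, image.shape))
            for _ in range(n_samples)]
    return np.mean(maps, axis=0)
```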
Integrated Grad-Cam: Sensitivity-Aware Visual Explanation of Deep Convolutional Networks Via Integrated Gradient-Based Scoring
TLDR
This work introduces a solution to tackle the issue of underestimating the contribution of the representations discovered by the model to its predictions by computing the path integral of the gradient-based terms in Grad-CAM.
Relevance-CAM: Your Model Already Knows Where to Look
As neural networks find wider application and continue to develop, the ability to explain deep learning models is becoming increasingly important. [...]
Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
  • S. Desai, H. G. Ramaswamy
  • Computer Science
    2020 IEEE Winter Conference on Applications of Computer Vision (WACV)
  • 2020
TLDR
This approach, Ablation-based Class Activation Mapping (Ablation-CAM), uses ablation analysis to determine the importance of individual feature map units w.r.t. the class, producing a coarse localization map that highlights the regions of the image important for predicting the concept.
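The gradient-free idea is simple to sketch: zero out each feature map in turn, re-score, and use the relative score drop as that map's weight. Here `score_fn` is a hypothetical callable standing in for the network head applied to the activations:

```python
import numpy as np

def ablation_cam(feature_maps, score_fn):
    """Ablation-CAM sketch (no gradients needed).

    feature_maps: (K, H, W); score_fn maps activations to a class score.
    """
    base = score_fn(feature_maps)
    weights = np.empty(feature_maps.shape[0])
    for k in range(feature_maps.shape[0]):
        ablated = feature_maps.copy()
        ablated[k] = 0.0                                  # knock out map k
        weights[k] = (base - score_fn(ablated)) / (base + 1e-8)
    cam = np.tensordot(weights, feature_maps, axes=1)     # (H, W)
    return np.maximum(cam, 0.0)
```

The loop costs one forward pass per feature map, the trade-off for avoiding gradient saturation issues.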
MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks
TLDR
The MACE framework dissects the feature maps generated by a convolution network for an image to extract concept-based prototypical explanations and estimates the relevance of the extracted concepts to the pretrained model’s predictions, a critical aspect for explaining the individual class predictions, missing in existing approaches.
LIFT-CAM: Towards Better Explanations for Class Activation Mapping
TLDR
This paper revisits the intrinsic linearity of CAM w.r.t. the activation maps and introduces an efficient approximation method, referred to as LIFT-CAM, which can estimate the true SHAP values quickly and accurately and achieves better performances than the other previous CAM-based methods in qualitative and quantitative aspects.
Adapting Grad-CAM for Embedding Networks
TLDR
This work proposes an adaptation of the Grad-CAM method for embedding networks, and develops an efficient weight-transfer method to explain decisions for any image without back-propagation.
Towards Learning Spatially Discriminative Feature Representations
TLDR
A novel loss function, termed as CAM-loss, is proposed to constrain the embedded feature maps with the class activation maps (CAMs) which indicate the spatially discriminative regions of an image for particular categories.
Attention Backpropagation: A Backtracking Approach to Visualize the Discriminative Image Regions
TLDR
Attention Backpropagation will backpropagate the network's attention to produce an attention map highlighting the important regions in the image for predicting, a generalized method that can be used for all the networks without modification of the network architecture or retraining.
A Step Towards Exposing Bias in Trained Deep Convolutional Neural Network Models
TLDR
It is discovered that Smooth Grad-CAM++ produced visually sharper maps with a larger number of salient pixels highlighted in the given input images when compared with other methods.

References

SHOWING 1-10 OF 37 REFERENCES
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
TLDR
This work proposes a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable, and shows that even non-attention based models learn to localize discriminative regions of input image.
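The core Grad-CAM computation summarized above fits in a few lines: pool the gradients of the class score over each feature map to get per-map weights, combine the maps, and keep only the positive evidence. A minimal sketch on precomputed arrays (a real use would obtain the gradients via autodiff):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM: gradient-pooled weights over the last conv layer.

    feature_maps: (K, H, W) activations; gradients: (K, H, W)
    d(score)/d(activation) for the target class.
    """
    alphas = gradients.mean(axis=(1, 2))              # (K,) pooled weights
    cam = np.tensordot(alphas, feature_maps, axes=1)  # (H, W) weighted sum
    return np.maximum(cam, 0.0)                       # ReLU: positive evidence
```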
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization
TLDR
It is shown that Guided Grad-CAM helps untrained users successfully discern a "stronger" deep network from a "weaker" one even when both networks make identical predictions, and also exposes the somewhat surprising insight that common CNN + LSTM models can be good at localizing discriminative input image regions despite not being trained on grounded image-text pairs.
Striving for Simplicity: The All Convolutional Net
TLDR
It is found that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks.
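The replacement described here is mechanical: a k x k convolution with stride 2 downsamples exactly like a 2x2 pool, but with learnable weights. A naive single-channel sketch of the strided valid convolution:

```python
import numpy as np

def strided_conv2d(x, kernel, stride=2):
    """Valid 2-D convolution with stride: the all-convolutional net's
    learnable substitute for max-pooling."""
    kh, kw = kernel.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = (patch * kernel).sum()
    return out
```

With an averaging kernel this reproduces average pooling; training the kernel instead is the paper's point.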
Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability [...]
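The CAM construction that follows from global average pooling is a one-liner: because the class score is a weighted sum of spatially averaged maps, projecting the same classifier weights back onto the maps localizes the class evidence. A sketch on plain arrays:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """CAM sketch: score_c = sum_k w[c,k] * mean(A_k) implies the
    spatial map sum_k w[c,k] * A_k localizes class c's evidence.

    feature_maps: (K, H, W); fc_weights: (C, K) classifier weights.
    """
    return np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
```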
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
TLDR
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.
Large-Scale Video Classification with Convolutional Neural Networks
TLDR
This work studies multiple approaches for extending the connectivity of a CNN in time domain to take advantage of local spatio-temporal information and suggests a multiresolution, foveated architecture as a promising way of speeding up the training.
From image-level to pixel-level labeling with Convolutional Networks
TLDR
A Convolutional Neural Network-based model is proposed, which is constrained during training to put more weight on pixels which are important for classifying the image, and which beats the state of the art results in weakly supervised object segmentation task by a large margin.
Tell Me Where to Look: Guided Attention Inference Network
TLDR
This work makes attention maps an explicit and natural component of the end-to-end training for the first time and provides self-guidance directly on these maps by exploring supervision from the network itself to improve them, and seamlessly bridge the gap between using weak and extra supervision if available.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Understanding Neural Networks Through Deep Visualization
TLDR
This work introduces several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations of convolutional neural networks.