Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
@article{Selvaraju2016GradCAMVE,
  title   = {Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization},
  author  = {Ramprasaath R. Selvaraju and Abhishek Das and Ramakrishna Vedantam and Michael Cogswell and Devi Parikh and Dhruv Batra},
  journal = {International Journal of Computer Vision},
  year    = {2020},
  volume  = {128},
  pages   = {336--359}
}
We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), and (3) CNNs used in tasks with multimodal inputs (e.g. visual question answering) or reinforcement learning, all without architectural changes or re-training.
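The core computation is compact enough to sketch. Below is a minimal, hedged Grad-CAM implementation in PyTorch; the choice of torchvision's VGG-16 and of `features[28]` as the target convolutional layer are illustrative assumptions, not part of the paper.

```python
# Minimal Grad-CAM sketch (assumes torchvision's VGG-16; the target layer is an assumption).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()

activations, gradients = {}, {}

def fwd_hook(_, __, output):
    activations["value"] = output

def bwd_hook(_, grad_input, grad_output):
    gradients["value"] = grad_output[0]

# Hook the last convolutional layer of VGG-16 (features[28] in torchvision).
target_layer = model.features[28]
target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

def grad_cam(image, class_idx=None):
    """image: (1, 3, 224, 224) tensor, already normalized."""
    scores = model(image)
    if class_idx is None:
        class_idx = scores.argmax(dim=1).item()
    model.zero_grad()
    scores[0, class_idx].backward()

    acts = activations["value"]                       # (1, C, H, W)
    grads = gradients["value"]                        # (1, C, H, W)
    weights = grads.mean(dim=(2, 3), keepdim=True)    # global-average-pooled gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # heatmap in [0, 1]
```

The channel weights are the global-average-pooled gradients of the class score, and the final ReLU keeps only the regions with a positive influence on the class.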
9,421 Citations
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks.
- Computer Science
- 2018
This paper proposes a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, compared to the state of the art.
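A hedged sketch of the Grad-CAM++ channel weighting, under the commonly used exponential-score approximation in which the higher-order derivatives reduce to powers of the first-order gradient; it reuses `model`, `activations`, and `gradients` from the Grad-CAM sketch above.

```python
# Grad-CAM++-style weighting sketch (exponential-score approximation; reuses the hooks above).
import torch
import torch.nn.functional as F

def grad_cam_pp(image, class_idx):
    scores = model(image)
    model.zero_grad()
    scores[0, class_idx].backward()

    acts, grads = activations["value"], gradients["value"]   # (1, C, H, W)
    grads_2, grads_3 = grads ** 2, grads ** 3
    # Per-location coefficients alpha, then per-channel weights from ReLU'd gradients.
    alpha = grads_2 / (2 * grads_2 + (acts * grads_3).sum(dim=(2, 3), keepdim=True) + 1e-8)
    weights = (alpha * F.relu(grads)).sum(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```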
Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
- Computer Science · 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2020
This approach, Ablation-based Class Activation Mapping (Ablation-CAM), uses ablation analysis to determine the importance of individual feature map units with respect to a class and produces a coarse localization map highlighting the regions of the image that are important for predicting the concept.
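A gradient-free sketch of the ablation idea described above: each feature-map channel at the target layer is zeroed in turn, and the relative drop in the class score is used as that channel's weight. It reuses `model`, `target_layer`, and the `activations` hook from the Grad-CAM sketch; the per-channel loop is deliberately unoptimized.

```python
# Ablation-based channel weighting sketch (gradient-free; reuses the setup above).
import torch
import torch.nn.functional as F

def ablation_cam(image, class_idx):
    with torch.no_grad():
        base_score = model(image)[0, class_idx]
        acts = activations["value"]                 # (1, C, H, W) from the forward hook
        weights = torch.zeros(acts.shape[1])
        for k in range(acts.shape[1]):
            # Zero out channel k at the target layer and re-run the model.
            def ablate(_, __, output, k=k):
                output = output.clone()
                output[:, k] = 0.0
                return output
            handle = target_layer.register_forward_hook(ablate)
            ablated_score = model(image)[0, class_idx]
            handle.remove()
            # Importance of channel k = relative drop in the class score.
            weights[k] = (base_score - ablated_score) / (base_score + 1e-8)
        cam = F.relu((weights.view(1, -1, 1, 1) * acts).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```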
Visual Explanations from Deep Networks via Riemann-Stieltjes Integrated Gradient-based Localization
- Computer Science · ISVC
- 2022
This work introduces a new technique to produce visual explanations for the predictions of a CNN that can be applied to any layer of the network, and is not affected by the problem of vanishing gradients.
Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
- Computer Science · ArXiv
- 2019
The Smooth Grad-CAM++ technique can visualize a whole layer, a subset of feature maps, or a subset of neurons within a feature map for each instance at inference time (i.e., during the model's prediction process).
Adapting Grad-CAM for Embedding Networks
- Computer Science · 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2020
This work proposes an adaptation of the Grad-CAM method for embedding networks, and develops an efficient weight-transfer method to explain decisions for any image without back-propagation.
Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks
- Computer Science · SN Comput. Sci.
- 2021
Eigen-CAM is presented to enhance explanations of CNN predictions by visualizing the principal components of the representations learned by the convolutional layers; the resulting explanations are more consistent, class-discriminative, and robust to classification errors made by the dense layers.
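A sketch of the principal-component projection described above; it is class-agnostic (no scores or gradients are used) and reuses the `activations` hook from the Grad-CAM sketch.

```python
# Principal-component projection of conv activations (Eigen-CAM-style sketch).
import torch
import torch.nn.functional as F

def eigen_cam(image):
    with torch.no_grad():
        model(image)                              # populate activations["value"]
        acts = activations["value"][0]            # (C, H, W)
        C, H, W = acts.shape
        flat = acts.reshape(C, H * W).T           # (H*W, C): one row per spatial location
        # Project each spatial location onto the first right-singular vector.
        _, _, vh = torch.linalg.svd(flat, full_matrices=False)
        cam = flat @ vh[0]                        # (H*W,)
        if cam.abs().max() != cam.max():          # resolve the sign ambiguity of the singular vector
            cam = -cam
        cam = cam.reshape(1, 1, H, W)
        cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```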
MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks
- Computer Science · IEEE Transactions on Artificial Intelligence
- 2021
The MACE framework dissects the feature maps generated by a convolutional network for an image to extract concept-based prototypical explanations, and estimates the relevance of the extracted concepts to the pretrained model’s predictions, a critical aspect of explaining individual class predictions that is missing from existing approaches.
Abs-CAM: A Gradient Optimization Interpretable Approach for Explanation of Convolutional Neural Networks
- Computer Science · Signal, Image and Video Processing
- 2022
An Absolute-value Class Activation Mapping (Abs-CAM) method is proposed that takes the gradients obtained from backpropagation and turns them all into positive values, enhancing the visual features of the output neurons’ activation and improving the localization ability of the saliency map.
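A minimal sketch of the positive-gradient weighting described above (any additional denoising stage from the paper is omitted); it reuses `model`, `activations`, and `gradients` from the Grad-CAM sketch.

```python
# Absolute-gradient weighting sketch: Grad-CAM with |gradients| as the pooled weights.
import torch
import torch.nn.functional as F

def abs_cam(image, class_idx):
    scores = model(image)
    model.zero_grad()
    scores[0, class_idx].backward()

    acts = activations["value"]
    grads = gradients["value"].abs()                  # all gradients made positive
    weights = grads.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```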
Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks
- Computer Science · ArXiv
- 2021
This paper proposes an efficient saliency-map generation method, Group score-weighted Class Activation Mapping (Group-CAM), which adopts a “split-transform-merge” strategy to generate saliency maps.
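A hedged sketch in the spirit of the split-transform-merge description above, not the paper's exact procedure: channels are split into groups, each group is collapsed into a candidate mask, each mask is scored by the confidence the model assigns to the masked input, and the scored masks are merged. It reuses `model` and the `activations` hook from the Grad-CAM sketch; `n_groups` is an arbitrary choice.

```python
# Split-transform-merge saliency sketch (simplified; reuses the hooks above).
import torch
import torch.nn.functional as F

def group_cam(image, class_idx, n_groups=32):
    with torch.no_grad():
        model(image)
        acts = activations["value"]                  # (1, C, H, W)
        groups = acts.chunk(n_groups, dim=1)         # split channels into groups
        cam = torch.zeros(1, 1, *image.shape[-2:])
        for g in groups:
            sub = F.relu(g.sum(dim=1, keepdim=True))                         # transform
            sub = F.interpolate(sub, size=image.shape[-2:], mode="bilinear",
                                align_corners=False)
            sub = (sub - sub.min()) / (sub.max() - sub.min() + 1e-8)
            score = F.softmax(model(image * sub), dim=1)[0, class_idx]       # score the masked input
            cam += score * sub                                               # merge
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```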
Review of white box methods for explanations of convolutional neural networks in image classification tasks
- Computer Science · J. Electronic Imaging
- 2021
This work aims to provide a comprehensive and detailed overview of a set of methods that can be used to create explanation maps for a particular image, i.e., maps that assign an importance score to each pixel of the image based on its contribution to the decision of the network.
References
SHOWING 1-10 OF 76 REFERENCES
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization
- Computer Science
- 2016
It is shown that Guided Grad-CAM helps untrained users successfully discern a "stronger" deep network from a "weaker" one even when both networks make identical predictions, and also exposes the somewhat surprising insight that common CNN + LSTM models can be good at localizing discriminative input image regions despite not being trained on grounded image-text pairs.
Fully convolutional networks for semantic segmentation
- Computer Science · 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Network Dissection: Quantifying Interpretability of Deep Visual Representations
- Computer Science · 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This work uses the proposed Network Dissection method to test the hypothesis that interpretability is an axis-independent property of the representation space, then applies the method to compare the latent representations of various networks when trained to solve different classification problems.
Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance
- Computer Science · ECCV
- 2018
This work learns to map domain knowledge about novel “unseen” classes onto a dictionary of learned concepts and optimizes for network parameters that can effectively combine these concepts, essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.
Visualizing Deep Convolutional Neural Networks Using Natural Pre-images
- Computer Science · International Journal of Computer Vision
- 2016
This paper studies several landmark representations, both shallow and deep, by a number of complementary visualization techniques based on the concept of “natural pre-image”, and shows that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
Striving for Simplicity: The All Convolutional Net
- Computer Science · ICLR
- 2015
It is found that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks.
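A tiny sketch of the substitution this entry describes, with arbitrary layer sizes: the 2x2 max-pool is replaced by a stride-2 convolution that downsamples by the same factor.

```python
# Replacing max-pooling with a strided convolution (layer sizes are illustrative).
import torch.nn as nn

pooled   = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
all_conv = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(64, 64, 3, stride=2, padding=1))   # downsamples like the pool
```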
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Computer Science · ICLR
- 2014
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.
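A sketch of the image-space saliency map this entry refers to: the class score is backpropagated to the input pixels and the per-pixel gradient magnitude is taken as saliency. It reuses `model` from the Grad-CAM sketch above.

```python
# Gradient-based image-space saliency sketch (reuses `model` from above).
import torch

def vanilla_saliency(image, class_idx):
    image = image.clone().requires_grad_(True)
    model.zero_grad()
    score = model(image)[0, class_idx]
    score.backward()
    return image.grad.abs().max(dim=1)[0]   # (1, H, W): max gradient magnitude over color channels
```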
Learning Deep Features for Discriminative Localization
- Computer Science · 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability…
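A sketch of the class activation mapping idea this entry refers to: with a global-average-pooling head, the final fully connected weights for a class can be projected back onto the last convolutional feature maps. Using torchvision's ResNet-18 here is an illustrative assumption.

```python
# Class activation mapping sketch for a GAP-headed network (assumes torchvision's ResNet-18).
import torch
import torch.nn.functional as F
from torchvision import models

resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def cam(image, class_idx):
    with torch.no_grad():
        # Feature maps just before global average pooling: (1, 512, h, w).
        feats = torch.nn.Sequential(*list(resnet.children())[:-2])(image)
        # Final fully connected weights for the chosen class: (512,).
        w = resnet.fc.weight[class_idx]
        heat = torch.einsum("c,bchw->bhw", w, feats).unsqueeze(1)
        heat = F.relu(heat)
        heat = F.interpolate(heat, size=image.shape[-2:], mode="bilinear", align_corners=False)
        return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
```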
Self-taught object localization with deep networks
- Computer Science · 2016 IEEE Winter Conference on Applications of Computer Vision (WACV)
- 2016
This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional…
Hierarchical Question-Image Co-Attention for Visual Question Answering
- Computer Science · NIPS
- 2016
This paper presents a novel co-attention model for VQA that jointly reasons about image and question attention in a hierarchical fashion via a novel 1-dimensional convolutional neural network (CNN).