Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

@article{Selvaraju2019GradCAMVE,
  title={Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization},
  author={Ramprasaath R. Selvaraju and Abhishek Das and Ramakrishna Vedantam and Michael Cogswell and Devi Parikh and Dhruv Batra},
  journal={International Journal of Computer Vision},
  year={2019},
  volume={128},
  pages={336--359}
}
We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. [...] Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model-families: (1) CNNs with fully-connected layers (e.g. [...]
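The core computation the abstract describes can be sketched in a few lines of NumPy, assuming the activations of the chosen convolutional layer and the gradients of the class score with respect to them have already been extracted (the function and variable names here are illustrative, not from the paper's code):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM sketch. Both inputs are (K, H, W) arrays for one
    image: K feature maps and the class-score gradients w.r.t. them."""
    # Neuron-importance weights: global-average-pool the gradients per channel.
    weights = gradients.mean(axis=(1, 2))                      # shape (K,)
    # Weighted combination of the feature maps, then ReLU to keep only
    # regions with a positive influence on the class of interest.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalize to [0, 1] for visualization (guard against an all-zero map).
    return cam / cam.max() if cam.max() > 0 else cam

# Toy example: 3 feature maps of size 4x4 with synthetic gradients.
rng = np.random.default_rng(0)
A = rng.random((3, 4, 4))
G = rng.random((3, 4, 4))
heatmap = grad_cam(A, G)
```

In practice the resulting coarse map is upsampled to the input resolution and overlaid on the image; the paper's Guided Grad-CAM variant additionally multiplies it with guided backpropagation for a high-resolution visualization.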
Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks
This paper proposes Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer's feature maps with respect to a specific class score to generate a visual explanation for the class label under consideration, providing better visual explanations of CNN model predictions.
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks.
This paper proposes a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions than the state of the art, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image.
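The "weighted combination of positive partial derivatives" that both Grad-CAM++ summaries describe can be sketched as below, using the closed-form approximation of the pixel-wise coefficients from the Grad-CAM++ paper; treat this as an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def grad_cam_pp(activations, gradients, eps=1e-8):
    """Grad-CAM++ sketch. activations/gradients: (K, H, W) arrays for one
    image; gradients are d(class score)/d(activations)."""
    g2, g3 = gradients ** 2, gradients ** 3
    # Closed-form pixel-wise coefficients alpha (approximation derived for
    # exponential-family class scores in the Grad-CAM++ paper).
    denom = 2.0 * g2 + activations.sum(axis=(1, 2), keepdims=True) * g3
    alpha = g2 / np.where(np.abs(denom) > eps, denom, eps)
    # Channel weights: alpha-weighted sum of the *positive* partial
    # derivatives only -- the key difference from plain Grad-CAM.
    weights = (alpha * np.maximum(gradients, 0.0)).sum(axis=(1, 2))
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    return cam / cam.max() if cam.max() > 0 else cam

rng = np.random.default_rng(1)
A = rng.random((3, 4, 4))
G = rng.standard_normal((3, 4, 4))
heatmap = grad_cam_pp(A, G)
```

Restricting the weighting to positive derivatives is what lets the method highlight several instances of the same class in one image, where averaged gradients can cancel out.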
Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
This approach, Ablation-based Class Activation Mapping (Ablation-CAM), uses ablation analysis to determine the importance of individual feature-map units with respect to a class, producing a coarse localization map that highlights the image regions important for predicting the concept.
Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
The Smooth Grad-CAM++ technique can visualize a layer, a subset of feature maps, or a subset of neurons within a feature map at each instance at inference time (the model prediction process).
Adapting Grad-CAM for Embedding Networks
This work proposes an adaptation of the Grad-CAM method for embedding networks and develops an efficient weight-transfer method to explain decisions for any image without back-propagation.
MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks
This work proposes MACE, a Model Agnostic Concept Extractor, which can explain the workings of a convolutional network through smaller concepts; the extracted concepts increase the human interpretability of the explanations and are faithful to the underlying pre-trained black-box model.
Review of white box methods for explanations of convolutional neural networks in image classification tasks
This work provides a comprehensive and detailed overview of methods for creating explanation maps for a particular image, which assign each pixel an importance score based on its contribution to the network's decision.
Attend and Guide (AG-Net): A Keypoints-Driven Attention-Based Deep Network for Image Recognition
An end-to-end CNN model that learns meaningful features linking fine-grained changes using a novel attention mechanism, which captures the spatial structure of images by identifying semantic regions (SRs) and their spatial distributions; this proves to be key to modeling subtle changes in images.
Context-based image explanations for deep neural networks
A model-agnostic method for generating context-based explanations aimed at general users is proposed; it visually outperformed existing gradient- and occlusion-based methods at explaining models' decisions to non-expert users in real-world applications.
Wider Vision: Enriching Convolutional Neural Networks via Alignment to External Knowledge Bases
The demonstrated approach of aligning a CNN with an external knowledge base paves the way to reasoning about and beyond the trained model, with future adaptations to explainable models and zero-shot learning.

References

Showing 1–10 of 101 references
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization
It is shown that Guided Grad-CAM helps untrained users successfully discern a "stronger" deep network from a "weaker" one even when both networks make identical predictions, and also exposes the somewhat surprising insight that common CNN + LSTM models can be good at localizing discriminative input image regions despite not being trained on grounded image-text pairs.
Network Dissection: Quantifying Interpretability of Deep Visual Representations
This work uses the proposed Network Dissection method to test the hypothesis that interpretability is an axis-independent property of the representation space, then applies the method to compare the latent representations of various networks when trained to solve different classification problems.
Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance
This work learns to map domain knowledge about novel "unseen" classes onto a dictionary of learned concepts and optimizes for network parameters that can effectively combine these concepts, essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.
Visualizing Deep Convolutional Neural Networks Using Natural Pre-images
This paper studies several landmark representations, both shallow and deep, via a number of complementary visualization techniques based on the concept of the "natural pre-image", and shows that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
Striving for Simplicity: The All Convolutional Net
It is found that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks.
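The replacement that this finding describes can be illustrated with a toy NumPy example: both operators downsample a feature map by the same factor, but the strided convolution's weights are learnable rather than a fixed max (the kernel here is an arbitrary illustrative choice):

```python
import numpy as np

def max_pool_2x2(x):
    # 2x2 max-pooling with stride 2 on an (H, W) map (H, W even).
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

def conv_stride2(x, kernel):
    # 2x2 convolution with stride 2: same downsampling factor, but the
    # "pooling" weights are now parameters the network can learn.
    H, W = x.shape
    return np.array([[(x[i:i + 2, j:j + 2] * kernel).sum()
                      for j in range(0, W - 1, 2)]
                     for i in range(0, H - 1, 2)])

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(x)
convolved = conv_stride2(x, np.full((2, 2), 0.25))   # average-like kernel
```

Both outputs are 2x2 here; the paper's point is that accuracy is preserved when the fixed pooling operator is swapped for the learnable strided one.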
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.
Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability [...]
Self-taught object localization with deep networks
This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional [...]
Hierarchical Question-Image Co-Attention for Visual Question Answering
This paper presents a novel co-attention model for VQA that jointly reasons about image and question attention in a hierarchical fashion via a novel 1-dimensional convolutional neural network (CNN).
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
A Fully Convolutional Localization Network (FCLN) architecture is proposed that processes an image with a single, efficient forward pass, requires no external region proposals, and can be trained end-to-end with a single round of optimization.