# Sanity Checks for Saliency Maps

    @inproceedings{Adebayo2018SanityCF,
      title     = {Sanity Checks for Saliency Maps},
      author    = {Julius Adebayo and J. Gilmer and Michael Muelly and I. Goodfellow and Moritz Hardt and Been Kim},
      booktitle = {NeurIPS},
      year      = {2018}
    }

Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. [...] We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings.
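The paper's central tool is a model-parameter randomization test: a saliency method that is faithful to the model should produce very different maps once the learned weights are replaced with random ones. A minimal sketch of that idea, using a toy linear model in NumPy (the model, sizes, and similarity metric here are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" linear model f(x) = w @ x; its gradient saliency is just w.
n_features = 1000
w_trained = rng.normal(size=n_features)

def gradient_saliency(w):
    # Magnitude of df/dx for a linear model is |w|, independent of x.
    return np.abs(w)

def rank_correlation(a, b):
    # Spearman correlation via rank transform + Pearson.
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

sal_trained = gradient_saliency(w_trained)

# Parameter randomization test: replace the weights with fresh random ones
# and recompute the saliency map.
w_random = rng.normal(size=n_features)
sal_random = gradient_saliency(w_random)

self_sim = rank_correlation(sal_trained, sal_trained)  # 1.0 by construction
rand_sim = rank_correlation(sal_trained, sal_random)   # near 0: map depends on weights
print(f"self: {self_sim:.3f}, randomized: {rand_sim:.3f}")
```

A method that *passes* the sanity check behaves like this gradient: its map decorrelates under randomization. The paper's finding is that several popular methods do not, producing near-identical maps for trained and random weights.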

#### 610 Citations

Learning Reliable Visual Saliency For Model Explanations

- Computer Science
- IEEE Transactions on Multimedia
- 2020

A simple method that requires the output of the model to be close to the original output while learning an explanatory saliency mask is proposed, and this approach helps to improve the reliability of visual saliency methods by suppressing false saliency responses.

Decoy-enhanced Saliency Maps

- 2020

Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction…

Robust Decoy-enhanced Saliency Maps

- Computer Science
- 2020

Experimental results suggest that the aggregated saliency map not only captures inter-feature dependence but also robustifies interpretation against previously described adversarial perturbation methods, and that it qualitatively and quantitatively outperforms existing methods.

Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels"

- Computer Science
- 2020

This thesis examines competition for pixels, showing that the algorithm has serious issues in the few-class setting, proposes an augmentation of the competition algorithm designed to address these issues, and experimentally verifies the effectiveness of said augmentation.

Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability

- Computer Science
- ArXiv
- 2020

Surprisingly, a well-tuned, relatively simple formulation of a masking-based saliency model outperforms many more complex approaches; the study also identifies the most important ingredients for high-quality saliency map generation.

There and Back Again: Revisiting Backpropagation Saliency Methods

- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020

This work conducts a thorough analysis of backpropagation-based saliency methods and proposes a single framework under which several such methods can be unified, and introduces a class-sensitivity metric and a meta-learning inspired paradigm applicable to any saliency method for improving sensitivity to the output class being explained.

Sanity Checks for Saliency Metrics

- Computer Science, Engineering
- AAAI
- 2020

This work investigates existing metrics for evaluating the fidelity of saliency methods and finds that there is little consistency in the literature in how such metrics are calculated, and shows that inconsistencies can have a significant effect on the measured fidelity.

Robust saliency maps with decoy-enhanced saliency score

- Computer Science
- ArXiv
- 2020

The empirical results show both qualitatively and quantitatively that the proposed robust saliency scores method outperforms raw scores produced by three existing saliency methods, even in the presence of adversarial attacks.

iGOS++: integrated gradient optimized saliency by bilateral perturbations

- Computer Science, Engineering
- CHIL
- 2021

iGOS++ is presented, a framework to generate saliency maps for black-box networks by considering both removal and preservation of evidence; a bilateral total variation term is introduced to the optimization that improves the continuity of the saliency map, especially at high resolution and with thin object parts.

Investigating sanity checks for saliency maps with image and text classification

- Computer Science
- ArXiv
- 2021

This paper performs sanity checks for the text modality and shows that the conclusions made for images do not directly transfer to text, and investigates the effects of the input multiplier in certain saliency maps using similarity scores, max-sensitivity and infidelity evaluation metrics.

#### References

Showing 1–10 of 45 references

The (Un)reliability of saliency methods

- Computer Science, Mathematics
- Explainable AI
- 2019

This work uses a simple and common pre-processing step (adding a constant shift to the input data) to show that a transformation with no effect on the model can cause numerous methods to attribute incorrectly.
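The input-invariance failure described here can be reproduced with a toy linear model: shift every input by a constant and compensate with the bias so the model's outputs are unchanged everywhere, then observe that gradient×input attributions change anyway. The model and numbers below are illustrative assumptions, not the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
w = rng.normal(size=n)
b1 = 0.5
x1 = rng.normal(size=n)

shift = 2.0                    # constant added to every input dimension
x2 = x1 + shift
b2 = b1 - w.sum() * shift      # compensating bias: both models agree on all inputs

out1 = w @ x1 + b1
out2 = w @ x2 + b2             # identical prediction for the shifted pair

# Plain gradient IS input-invariant: it equals w for both models.
grad1 = grad2 = w

# Gradient * input is NOT: the attribution changes although the model did not.
gxi1 = w * x1
gxi2 = w * x2
print(np.isclose(out1, out2), np.allclose(gxi1, gxi2))  # → True False
```

The second model is functionally identical to the first on the shifted data, so any attribution difference reflects the method, not the model.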

Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values

- Computer Science, Mathematics
- ICLR
- 2018

Somewhat surprisingly, it is found that DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights.

Evaluating the Visualization of What a Deep Neural Network Has Learned

- Computer Science, Medicine
- IEEE Transactions on Neural Networks and Learning Systems
- 2017

A general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps, and shows that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method.
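The region-perturbation idea can be sketched in a few lines: remove (here, zero out) features in order of decreasing saliency and watch how fast the model's score drops; a faithful ordering should degrade the score faster than a random one. A toy linear-model version, with sizes and the zeroing baseline as illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
w = rng.normal(size=n)   # weights of a toy linear classifier f(x) = w @ x
x = rng.normal(size=n)

def score_after_removal(order, k):
    # "Perturb" the first k features in the given order by zeroing them.
    xp = x.copy()
    xp[order[:k]] = 0.0
    return w @ xp

# Saliency = each feature's signed contribution to the score.
saliency = w * x
sal_order = np.argsort(-saliency)    # largest contributions removed first
rand_order = rng.permutation(n)      # random-order baseline

k = 20
base = w @ x
drop_sal = base - score_after_removal(sal_order, k)
drop_rand = base - score_after_removal(rand_order, k)
print(drop_sal >= drop_rand)  # → True
```

For the linear model this inequality is guaranteed: removing the k largest signed contributions maximizes the score drop over any k-feature subset, which is exactly the behavior a faithful heatmap ordering should approximate for a deep model.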

Real Time Image Saliency for Black Box Classifiers

- Computer Science, Mathematics
- NIPS
- 2017

A masking model is trained to manipulate the scores of the classifier by masking salient parts of the input image; it is found to generalise well to unseen images and requires only a single forward pass to perform saliency detection, making it suitable for use in real-time systems.

SmoothGrad: removing noise by adding noise

- Computer Science, Mathematics
- ArXiv
- 2017

SmoothGrad is introduced, a simple method that can help visually sharpen gradient-based sensitivity maps, and lessons in the visualization of these maps are discussed.
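SmoothGrad's core operation is just averaging gradients over Gaussian-perturbed copies of the input. A minimal sketch on a toy 1-D "model" whose gradient has a smooth trend buried under a high-frequency term (the function, noise scale, and sample count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def grad(x):
    # Analytic gradient of a toy score f(x) = sum(sin(5x) + x^2):
    # a smooth trend (2x) plus a high-frequency term (5*cos(5x)).
    return 5 * np.cos(5 * x) + 2 * x

def smoothgrad(x, sigma=0.5, n_samples=200):
    # Average gradients over Gaussian-perturbed copies of the input.
    grads = [grad(x + rng.normal(scale=sigma, size=x.shape))
             for _ in range(n_samples)]
    return np.mean(grads, axis=0)

x = np.linspace(-1.0, 1.0, 10)
raw = grad(x)
smooth = smoothgrad(x)

# The oscillatory component averages out; the 2x trend survives.
print(np.mean(np.abs(raw - 2 * x)) > np.mean(np.abs(smooth - 2 * x)))
```

The same averaging is applied pixel-wise to image gradients in the paper; the noise scale `sigma` trades off sharpening against blurring of genuinely fine-grained sensitivity.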

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

- Computer Science
- ICLR
- 2014

This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.

Visualizing and Understanding Convolutional Networks

- Computer Science
- ECCV
- 2014

A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.

Interpretation of Neural Networks is Fragile

- Computer Science, Mathematics
- AAAI
- 2019

This paper systematically characterizes the fragility of several widely-used feature-importance interpretation methods (saliency maps, relevance propagation, and DeepLIFT) on ImageNet and CIFAR-10, and extends these results to show that interpretations based on exemplars (e.g. influence functions) are similarly fragile.

Deep Image Prior

- Computer Science, Mathematics
- 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018

It is shown that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, superresolution, and inpainting.

Noise-adding Methods of Saliency Map as Series of Higher Order Partial Derivative

- Computer Science, Mathematics
- ArXiv
- 2018

This work analytically formalizes the result of the noise-adding methods SmoothGrad and VarGrad, providing a clue to the relationship between local explanation methods of deep neural networks and higher-order partial derivatives of the score function.