Corpus ID: 52938797

Sanity Checks for Saliency Maps

@inproceedings{Adebayo2018SanityCF,
  title={Sanity Checks for Saliency Maps},
  author={Julius Adebayo and Justin Gilmer and Michael Muelly and Ian J. Goodfellow and Moritz Hardt and Been Kim},
  booktitle={NeurIPS},
  year={2018}
}
Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. [...] Key Result: We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings.
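One of the paper's sanity checks, the model parameter randomization test, compares the saliency map computed on the trained model with maps computed as the network's layers are progressively re-initialized. A minimal sketch of that test, assuming a PyTorch model and a placeholder compute_saliency(model, x, target) function that returns a NumPy array (neither name comes from the paper's code):

import copy
import numpy as np
from scipy.stats import spearmanr

def cascading_randomization_check(model, x, target, compute_saliency):
    """Re-randomize layers from the output end toward the input and track how
    much the saliency map changes relative to the trained model."""
    baseline = np.ravel(compute_saliency(model, x, target))
    randomized = copy.deepcopy(model)
    similarities = []
    # modules() lists the root first; reversed() visits the last-defined
    # (typically output-side) layers first, approximating cascading randomization.
    for module in reversed(list(randomized.modules())):
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()              # destroy this layer's learned weights
            current = np.ravel(compute_saliency(randomized, x, target))
            rho, _ = spearmanr(baseline, current)
            similarities.append(rho)               # near-zero rank correlation: map depends on the weights
    return similarities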
Learning Reliable Visual Saliency For Model Explanations
TLDR
A simple method is proposed that learns an explanatory saliency mask while requiring the model's output to stay close to its original output; this approach improves the reliability of visual saliency methods by suppressing false saliency responses.
DECOY-ENHANCED SALIENCY MAPS
2020
Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction [...]
Robust Decoy-enhanced Saliency Maps.
TLDR
Experimental results suggest that the aggregated saliency map not only captures inter-feature dependence but also robustifies interpretation against previously described adversarial perturbation methods, and it qualitatively and quantitatively outperforms existing methods.
Revisiting Sanity Checks for Saliency Maps
TLDR
This work challenges the utility of the sanity check methodology, and highlights that saliency map evaluation beyond ad-hoc visual examination remains a fundamental challenge.
Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels"
TLDR
This thesis examines competition for pixels, showing that the algorithm has serious issues in the few-class setting; it proposes an augmentation of the competition algorithm designed to address these issues and experimentally verifies its effectiveness.
Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability
TLDR
Surprisingly, a well-tuned, relatively simple formulation of a masking-based saliency model outperforms many more complex approaches, and the most important ingredients for high-quality saliency map generation are identified.
There and Back Again: Revisiting Backpropagation Saliency Methods
TLDR
This work conducts a thorough analysis of backpropagation-based saliency methods and proposes a single framework under which several such methods can be unified, and introduces a class-sensitivity metric and a meta-learning inspired paradigm applicable to any saliency method for improving sensitivity to the output class being explained.
Sanity Checks for Saliency Metrics
TLDR
This work investigates existing metrics for evaluating the fidelity of saliency methods and finds that there is little consistency in the literature in how such metrics are calculated, and shows that such inconsistencies can have a significant effect on the measured fidelity.
Focus! Rating XAI Methods and Finding Biases with Mosaics
TLDR
A consistent evaluation metric for feature attribution methods, the Focus, is proposed to quantify their coherence with the task; some methods are found to be consistently reliable, while others produce class-agnostic explanations.
Robust saliency maps with decoy-enhanced saliency score
TLDR
The empirical results show both qualitatively and quantitatively that the proposed robust saliency scores method outperforms raw scores produced by three existing saliency methods, even in the presence of adversarial attacks.
[...]

References

Showing 1-10 of 45 references
The (Un)reliability of saliency methods
TLDR
This work uses a simple and common pre-processing step, adding a constant shift to the input data, to show that a transformation with no effect on the model can cause numerous methods to attribute incorrectly.
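The input-shift construction is simple enough to check directly on a toy linear model: absorb the constant shift into the bias so the two models are functionally identical, then compare gradient-times-input attributions. A minimal sketch with illustrative numbers, not the paper's experimental setup:

import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(1, 5)), rng.normal(size=1)
x = rng.normal(size=5)
shift = 2.0                                    # constant added to every input feature

# Model 2 absorbs the shift into its bias, so both models give identical outputs.
b2 = b - W @ np.full(5, shift)
out1 = W @ x + b
out2 = W @ (x + shift) + b2
assert np.allclose(out1, out2)

# Gradient x Input attribution differs even though the two models are equivalent.
attr1 = W.ravel() * x
attr2 = W.ravel() * (x + shift)
print(np.abs(attr1 - attr2).max())             # nonzero: the attribution changed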
Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values
TLDR
Somewhat surprisingly, it is found that DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights.
Evaluating the Visualization of What a Deep Neural Network Has Learned
TLDR
A general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps and shows that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method.
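A hedged sketch of the region-perturbation idea: occlude the highest-relevance patches first and record how quickly the class score drops; a faithful heatmap should make the curve fall steeply. The patch size, replacement noise, and model_score callable are placeholders, not the paper's exact protocol:

import numpy as np

def perturbation_curve(image, heatmap, model_score, patch=8, steps=20):
    """image, heatmap: (H, W) arrays; model_score: callable image -> class score."""
    img = image.copy()
    h, w = image.shape
    # Rank non-overlapping patches by their summed relevance.
    patches = [(heatmap[i:i+patch, j:j+patch].sum(), i, j)
               for i in range(0, h, patch) for j in range(0, w, patch)]
    patches.sort(reverse=True)
    scores = [model_score(img)]
    for _, i, j in patches[:steps]:
        img[i:i+patch, j:j+patch] = np.random.uniform(size=(min(patch, h - i), min(patch, w - j)))
        scores.append(model_score(img))        # a faithful heatmap makes this fall fast
    return scores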
Real Time Image Saliency for Black Box Classifiers
TLDR
A masking model is trained to manipulate the scores of the classifier by masking salient parts of the input image; it generalises well to unseen images and requires only a single forward pass to perform saliency detection, making it suitable for use in real-time systems.
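The paper trains a separate masking network so that saliency detection costs a single forward pass; the sketch below shows the slower per-image variant of the same objective, directly optimizing a mask that suppresses the class score while staying sparse. All hyperparameters are illustrative:

import torch

def optimize_mask(model, x, target, steps=200, lam=1e-2, lr=0.1):
    """x: (1, C, H, W) input; returns a (1, 1, H, W) mask in [0, 1]."""
    mask_logits = torch.zeros(1, 1, *x.shape[-2:], requires_grad=True)
    opt = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(steps):
        mask = torch.sigmoid(mask_logits)
        masked = x * (1 - mask)                        # delete the masked region
        score = model(masked).softmax(dim=1)[0, target]
        loss = score + lam * mask.abs().mean()         # drop the score, keep the mask sparse
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask_logits).detach()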
SmoothGrad: removing noise by adding noise
TLDR
SmoothGrad is introduced, a simple method that can help visually sharpen gradient-based sensitivity maps, and lessons in the visualization of these maps are discussed.
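A hedged SmoothGrad sketch: average the gradient of the class score over noisy copies of the input. The noise level and sample count below are illustrative defaults, not the paper's recommended settings:

import torch

def smoothgrad(model, x, target, n_samples=25, sigma=0.15):
    """x: (1, C, H, W); returns an averaged gradient map of the same shape."""
    total = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[0, target]
        grad, = torch.autograd.grad(score, noisy)      # gradient w.r.t. the noisy copy
        total += grad
    return total / n_samples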
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
TLDR
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.
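The basic gradient ("vanilla") saliency map from this line of work is the absolute gradient of the class score with respect to the input pixels, reduced over channels. A minimal sketch in PyTorch:

import torch

def vanilla_saliency(model, x, target):
    """x: (1, C, H, W); returns an (H, W) saliency map."""
    x = x.clone().requires_grad_(True)
    score = model(x)[0, target]
    score.backward()
    return x.grad.abs().max(dim=1).values[0]   # channel-wise max of |gradient|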
Visualizing and Understanding Convolutional Networks
TLDR
A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.
Deep Image Prior
TLDR
It is shown that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, superresolution, and inpainting.
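A compact sketch of the deep-image-prior idea for denoising: fit a randomly-initialized convolutional network to the noisy image from a fixed noise code and rely on early stopping as the regularizer. The tiny architecture and iteration count are illustrative and far smaller than the paper's:

import torch
import torch.nn as nn

def dip_denoise(noisy, steps=500, lr=1e-2):
    """noisy: (1, C, H, W) image tensor with values in [0, 1]."""
    net = nn.Sequential(
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, noisy.shape[1], 3, padding=1), nn.Sigmoid(),
    )
    z = torch.randn(1, 32, *noisy.shape[-2:])          # fixed random input code
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):                             # early stopping acts as the regularizer
        opt.zero_grad()
        loss = ((net(z) - noisy) ** 2).mean()
        loss.backward()
        opt.step()
    return net(z).detach()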
Noise-adding Methods of Saliency Map as Series of Higher Order Partial Derivative
TLDR
This work analytically formalizes the result of noise-adding methods SmoothGrad and VarGrad and believes that it provides a clue to reveal the relationship between local explanation methods of deep neural networks and higher-order partial derivatives of the score function.
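A one-formula sketch of that relationship, assuming zero-mean Gaussian noise with variance \sigma^2 (the SmoothGrad setting): Taylor-expanding the gradient around x and taking the expectation makes the odd-order terms vanish, leaving the plain gradient plus higher-order partial derivatives of the score function,

E_{\epsilon \sim \mathcal{N}(0,\sigma^2 I)}\!\left[\frac{\partial f}{\partial x_i}(x+\epsilon)\right]
  = \frac{\partial f}{\partial x_i}(x)
  + \frac{\sigma^2}{2}\sum_j \frac{\partial^3 f}{\partial x_i\,\partial x_j^2}(x)
  + O(\sigma^4).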
Grad-CAM: Why did you say that?
We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions, or visual explanations. [...]
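A hedged Grad-CAM sketch: weight a convolutional layer's feature maps by the spatially-averaged gradient of the class score and pass the weighted sum through a ReLU. The code assumes the activations were captured (e.g. with a forward hook) and still participate in the autograd graph:

import torch

def grad_cam(features, score):
    """features: (1, K, h, w) activations of the chosen conv layer,
    score: scalar class logit that depends on `features`."""
    grads, = torch.autograd.grad(score, features)
    weights = grads.mean(dim=(2, 3), keepdim=True)     # one weight per channel
    cam = torch.relu((weights * features).sum(dim=1))  # (1, h, w) coarse heatmap
    return cam / (cam.max() + 1e-8)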
[...]