Corpus ID: 235605956

How Well do Feature Visualizations Support Causal Understanding of CNN Activations?

Roland S. Zimmermann, Judy Borowski, Robert Geirhos, Matthias Bethge, Thomas S. A. Wallis, Wieland Brendel
One widely used approach towards understanding the inner workings of deep convolutional neural networks is to visualize unit responses via activation maximization. Feature visualizations via activation maximization are thought to provide humans with precise information about the image features that cause a unit to be activated. If this is indeed true, these synthetic images should enable humans to predict the effect of an intervention, such as whether occluding a certain patch of the image (say…


Diverse feature visualizations reveal invariances in early layers of deep neural networks
It is found that even early convolutional layers in VGG-19 exhibit various forms of response invariance: near-perfect phase invariance in some units and invariance to local diffeomorphic transformations in others.
Understanding Deep Networks via Extremal Perturbations and Smooth Masks
Some of the shortcomings of existing approaches to perturbation analysis are discussed, and the concept of extremal perturbations is introduced. Extremal perturbations are theoretically grounded and interpretable, and they allow all tunable weighing factors to be removed from the optimization problem.
Understanding Neural Networks via Feature Visualization: A survey
This chapter reviews existing AM techniques in the literature; discusses a probabilistic interpretation for AM; and reviews the applications of AM in debugging and explaining networks.
Towards falsifiable interpretability research
It is argued that interpretability research suffers from an over-reliance on intuition-based approaches that risk, and in some cases have caused, illusory progress and misleading conclusions. A strategy to address these impediments is proposed in the form of a framework for strongly falsifiable interpretability research.
Quantitative Evaluation of Machine Learning Explanations: A Human-Grounded Benchmark
A benchmark for image and text domains using multi-layer human attention masks aggregated from multiple human annotators is proposed, and its utility for quantitative evaluation of model explanations is demonstrated by comparing it with human subjective ratings and ground-truth single-layer segmentation mask evaluations.
Visualizing Higher-Layer Features of a Deep Network
This paper contrasts and compares several techniques applied to Stacked Denoising Autoencoders and Deep Belief Networks, trained on several vision datasets, and shows that good qualitative interpretations of high-level features represented by such models are possible at the unit level.
A psychophysics approach for quantitative comparison of interpretable computer vision models
The results demonstrate that psychophysical experiments allow for robust quality assessment of transparency in machine learning and highlight the potential of methods from classical psychophysics for modern machine learning applications.
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images
This work takes convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and finds images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class, and produces fooling images, which are then used to raise questions about the generality of DNN computer vision.
Understanding deep image representations by inverting them
Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of…
The effects of example-based explanations in a machine learning interface
It is suggested that examples can serve as a vehicle for explaining algorithmic behavior, and relative advantages and disadvantages of using different kinds of examples, depending on the goal, are pointed out.