
Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors

Ruihan Zhang, Prashan Madumal, Tim Miller, Krista A. Ehinger, Benjamin I. P. Rubinstein
Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form. This deficiency remains a key challenge when applying CNNs in important domains. Recent work on explanations through feature importance of approximate linear models has moved from input-level features (pixels or segments) to features from mid-layer feature maps in the form of concept activation vectors (CAVs). CAVs contain concept-level information and could be learned…
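A minimal sketch of the underlying idea (not the authors' implementation): factorize non-negative mid-layer activations as A ≈ S P, so that the rows of P play the role of non-negative concept vectors and the product S P inverts the reduction back into activation space. The `nmf` helper, array names, and shapes below are illustrative assumptions, with random data standing in for real feature maps.

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n, c) into S (n, k) @ P (k, c)
    using Lee-Seung multiplicative updates; both factors stay non-negative."""
    rng = np.random.default_rng(seed)
    n, c = V.shape
    S = rng.random((n, k))
    P = rng.random((k, c))
    for _ in range(iters):
        P *= (S.T @ V) / (S.T @ S @ P + eps)   # update concept vectors
        S *= (V @ P.T) / (S @ P @ P.T + eps)   # update per-position scores
    return S, P

# Stand-in for flattened post-ReLU feature maps: (n_positions, n_channels), non-negative.
rng = np.random.default_rng(1)
A = np.abs(rng.normal(size=(500, 64)))

S, P = nmf(A, k=5)     # rows of P act as non-negative concept vectors
recon = S @ P          # "inverting" the reduction back to activation space
rel_err = np.linalg.norm(A - recon) / np.linalg.norm(A)
```

Because both factors are constrained to be non-negative, each activation is approximated as an additive mixture of concepts, which is what makes the reconstruction interpretable.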

References


Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Metrics for Explainable AI: Challenges and Prospects
This paper discusses specific methods for evaluating the goodness of explanations, whether users are satisfied by explanations, how well users understand the AI systems, and how the human-XAI work system performs.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
The Caltech-UCSD Birds-200-2011 Dataset
CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization…
CoCoX: Generating Conceptual and Counterfactual Explanations via Fault-Lines
It is argued that, due to the conceptual and counterfactual nature of fault-lines, the CoCoX explanations are practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models.
Towards falsifiable interpretability research
It is argued that interpretability research suffers from an over-reliance on intuition-based approaches that risk, and in some cases have caused, illusory progress and misleading conclusions; a strategy to address these impediments is proposed in the form of a framework for strongly falsifiable interpretability research.
Explanations can be manipulated and geometry is to blame
It is shown that explanations can be manipulated arbitrarily by applying barely perceptible perturbations to the input that keep the network's output approximately constant; theoretically, this phenomenon can be related to certain geometric properties of neural networks.
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
This work proposes a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable, and shows that even non-attention-based models learn to localize discriminative regions of the input image.
Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way
V. Dignum. Artificial Intelligence: Foundations, Theory, and Algorithms, 2019.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
C. Rudin. Nature Machine Intelligence, 2019.
This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision.