• Corpus ID: 228375006

Dependency Decomposition and a Reject Option for Explainable Models

Jan Kronenberger and Anselm Haselhoff
Deploying machine learning models in safety-related domains (e.g. autonomous driving, medical diagnosis) demands approaches that are explainable, robust against adversarial attacks, and aware of the model uncertainty. Recent deep learning models perform extremely well in various inference tasks, but the black-box nature of these approaches leads to a weakness regarding the three requirements mentioned above. Recent advances offer methods to visualize features, describe attribution of the…

Towards Black-Box Explainability with Gaussian Discriminant Knowledge Distillation

A Concept Probability Density Encoder (CPDE) is introduced that, in conjunction with a Gaussian Discriminant Decoder (GDD), describes the contribution of high-level concepts in the context of object detection, using the DensePose dataset.

Concept Embedding Analysis: A Review

A general definition of CA and a taxonomy for CA methods are established, uniting several ideas from the literature, which makes it easy to position and compare CA approaches.



Interpretable Explanations of Black Boxes by Meaningful Perturbation

A general framework for learning different kinds of explanations for any black box algorithm is proposed, and the framework is specialised to find the part of an image most responsible for a classifier decision.

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.

Explaining and Harnessing Adversarial Examples

It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.

On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation

This work proposes a general solution to the problem of understanding classification decisions through pixel-wise decomposition of nonlinear classifiers, introducing a methodology that allows visualizing the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks.

Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation

The key idea is to design a generator that competes against a discriminator: the generator explores weaknesses of the discriminator, while the discriminator learns from hard augmentations to achieve better performance.

Generating Visual Explanations

A new model is proposed that focuses on the discriminating properties of the visible object, jointly predicts a class label, and explains why the predicted label is appropriate for the image, and generates sentences that realize a global sentence property, such as class specificity.

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

This work proposes a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable, and shows that even non-attention based models learn to localize discriminative regions of the input image.

Visualizing Higher-Layer Features of a Deep Network

This paper contrasts and compares several techniques applied to Stacked Denoising Autoencoders and Deep Belief Networks, trained on several vision datasets, and shows that good qualitative interpretations of high-level features represented by such models are possible at the unit level.

PatternNet and PatternLRP - Improving the interpretability of neural networks

An improved method is proposed that may serve as an extension for existing back-projection and decomposition techniques, and a quality criterion for explanation methods is formulated.

Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks

This paper introduces a structured approach for generating Universal Adversarial Perturbations (UAPs) with procedural noise functions, and unveils the systemic vulnerability of popular DCN models like Inception v3 and YOLO v3, with single noise patterns able to fool a model on up to 90% of the dataset.