# Interpretable Counterfactual Explanations Guided by Prototypes

    @inproceedings{Looveren2021InterpretableCE,
      title     = {Interpretable Counterfactual Explanations Guided by Prototypes},
      author    = {Arnaud Van Looveren and Janis Klaise},
      booktitle = {ECML/PKDD},
      year      = {2021}
    }

We propose a fast, model-agnostic method for finding interpretable counterfactual explanations of classifier predictions by using class prototypes. We show that class prototypes, obtained using either an encoder or through class-specific k-d trees, significantly speed up the search for counterfactual instances and result in more interpretable explanations. We introduce two novel metrics to quantitatively evaluate local interpretability at the instance level. We use these metrics to…
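The prototype-guided search can be sketched as follows. This is a minimal, illustrative NumPy reimplementation under stated assumptions, not the authors' code: it substitutes a brute-force nearest-neighbor lookup for the per-class k-d trees, a nearest-centroid toy model for the black-box classifier, and a plain gradient/proximal loop for the full objective (elastic-net distance to the original instance plus a prototype term with weight `theta`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: Gaussian blobs centered at (0, 0) and (4, 4).
X0 = rng.normal(0.0, 0.7, size=(100, 2))
X1 = rng.normal(4.0, 0.7, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)


def predict(x):
    """Stand-in black-box classifier: nearest class centroid."""
    c0, c1 = X0.mean(axis=0), X1.mean(axis=0)
    return int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))


def prototype(x, target):
    """Nearest training instance of the target class to x.

    The paper builds one k-d tree per class to make this lookup fast;
    brute force is enough for a sketch."""
    Xt = X[y == target]
    return Xt[np.argmin(np.linalg.norm(Xt - x, axis=1))]


def counterfactual(x, target, steps=200, lr=0.02, beta=0.1, theta=10.0):
    """Gradient-descent surrogate of the prototype-guided objective:
    pull the perturbed instance toward the target-class prototype
    (weight theta) while an elastic-net penalty (L2 gradient plus L1
    soft-thresholding with weight beta) keeps it close to, and sparse
    relative to, the original instance x."""
    proto = prototype(x, target)
    cf = x.copy()
    for _ in range(steps):
        grad = 2.0 * (cf - x) + theta * 2.0 * (cf - proto)
        cf = cf - lr * grad
        # Proximal (soft-thresholding) update for the L1 term.
        delta = cf - x
        cf = x + np.sign(delta) * np.maximum(np.abs(delta) - lr * beta, 0.0)
        if predict(cf) == target:  # stop as soon as the class flips
            break
    return cf


x = np.array([0.2, -0.1])        # an instance the toy classifier puts in class 0
cf = counterfactual(x, target=1)
```

Stopping as soon as the prediction flips is what keeps the counterfactual near the decision boundary, so it stays close to the original instance while the prototype term ensures it moves toward the data distribution of the target class rather than into off-manifold regions.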


#### 108 Citations

Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties

- Computer Science, Mathematics
- AISTATS
- 2021

This work introduces a simple and fast method for generating interpretable CEs in a white-box setting without an auxiliary model, by using the predictive uncertainty of the classifier.

Conditional Generative Models for Counterfactual Explanations

- Computer Science, Mathematics
- ArXiv
- 2021

A general framework to generate sparse, in-distribution counterfactual model explanations which match a desired target prediction with a conditional generative model is proposed, allowing batches of counterfactual instances to be generated with a single forward pass.

Ensemble of Counterfactual Explainers

- 2021

In eXplainable Artificial Intelligence (XAI), several counterfactual explainers have been proposed, each focusing on some desirable properties of counterfactual instances: minimality, actionability,…

Beyond Trivial Counterfactual Generations

- 2020

Explainability of machine learning models has gained considerable attention within our research community given the importance of deploying more reliable machine-learning systems. Explainability can…

SCOUT: Self-Aware Discriminant Counterfactual Explanations

- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020

It is argued that self-awareness, namely the ability to produce classification confidence scores, is important for the computation of discriminant explanations, which seek to identify regions where it is easy to discriminate between prediction and counter class.

Counterfactual Explanations for Arbitrary Regression Models

- Computer Science
- ArXiv
- 2021

This work formulates CFE search for regression models in a rigorous mathematical framework using differentiable potentials, which resolves robustness issues in threshold-based objectives, and proves that in this framework verifying the existence of counterfactuals is NP-complete and finding instances using such potentials is CLS-complete.

On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning

- Computer Science
- AAAI
- 2021

The present method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semi-factuals.

CARE: Coherent Actionable Recourse based on Sound Counterfactual Explanations

- Computer Science
- ArXiv
- 2021

Counterfactual explanation methods interpret the outputs of a machine learning model in the form of "what-if scenarios" without compromising the fidelity-interpretability trade-off. They explain how…

GANterfactual - Counterfactual Explanations for Medical Non-Experts using Generative Adversarial Learning

- Computer Science
- 2020

GANterfactual is presented, an approach to generating such counterfactual image explanations based on adversarial image-to-image translation techniques, which leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems that work with saliency maps, namely LIME and LRP.

ECINN: Efficient Counterfactuals from Invertible Neural Networks

- Computer Science
- ArXiv
- 2021

A method is proposed, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently and outperforms established methods that generate heatmap-based explanations.

#### References

Showing 1–10 of 44 references

Explaining machine learning classifiers through diverse counterfactual explanations

- Computer Science, Mathematics
- FAT*
- 2020

This work proposes a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes, and provides metrics that enable comparison of counterfactual-based methods to other local explanation methods.

Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives

- Computer Science
- NeurIPS
- 2018

A novel method that provides contrastive explanations justifying the classification of an input by a black box classifier such as a deep neural network is proposed, and it is argued that such explanations are natural for humans and are used commonly in domains such as health care and criminology.

TIP: Typifying the Interpretability of Procedures

- Computer Science, Mathematics
- ArXiv
- 2017

A novel notion of what it means to be interpretable is provided, looking past the usual association with human understanding, and a framework is defined that allows comparing interpretable procedures by linking them to important practical aspects such as accuracy and robustness.

Examples are not enough, learn to criticize! Criticism for Interpretability

- Computer Science
- NIPS
- 2016

Motivated by the Bayesian model criticism framework, MMD-critic is developed, which efficiently learns prototypes and criticism, designed to aid human interpretability.

Model Agnostic Contrastive Explanations for Structured Data

- Computer Science, Mathematics
- ArXiv
- 2019

This work proposes a method, Model Agnostic Contrastive Explanations Method (MACEM), to generate contrastive explanations for any classification model where one can only query the class probabilities for a desired input, and quantitatively and qualitatively validates this approach on five public datasets covering diverse domains.

Comparison-Based Inverse Classification for Interpretability in Machine Learning

- Computer Science
- IPMU
- 2018

An inverse classification approach whose principle is to determine the minimal changes needed to alter a prediction: in an instance-based framework, given a data point whose classification must be explained, the proposed method identifies a close neighbor classified differently, where the closeness definition integrates a sparsity constraint.

Anchors: High-Precision Model-Agnostic Explanations

- Computer Science
- AAAI
- 2018

We introduce a novel model-agnostic system that explains the behavior of complex models with high-precision rules called anchors, representing local, "sufficient" conditions for predictions. We…

Interpreting Black Box Predictions using Fisher Kernels

- Computer Science, Mathematics
- AISTATS
- 2019

This work takes a novel look at black box interpretation of test predictions in terms of training examples, making use of Fisher kernels as the defining feature embedding of each data point, combined with Sequential Bayesian Quadrature (SBQ) for efficient selection of examples.

Generating Contrastive Explanations with Monotonic Attribute Functions

- Computer Science, Mathematics
- ArXiv
- 2019

This paper proposes a method that can generate contrastive explanations for deep neural networks where aspects that are in themselves sufficient to justify the classification by the deep model are highlighted, but also new aspects which, if added, will change the classification.

Prototypical Networks for Few-shot Learning

- Computer Science, Mathematics
- NIPS
- 2017

This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.