Interpretable Counterfactual Explanations Guided by Prototypes

@inproceedings{Looveren2021InterpretableCE,
  title={Interpretable Counterfactual Explanations Guided by Prototypes},
  author={Arnaud Van Looveren and Janis Klaise},
  booktitle={ECML/PKDD},
  year={2021}
}
We propose a fast, model-agnostic method for finding interpretable counterfactual explanations of classifier predictions by using class prototypes. We show that class prototypes, obtained using either an encoder or through class-specific k-d trees, significantly speed up the search for counterfactual instances and result in more interpretable explanations. We introduce two novel metrics to quantitatively evaluate local interpretability at the instance level. We use these metrics to …
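The abstract's core idea, adding a term that pulls the counterfactual search toward a prototype of the target class, can be sketched roughly as follows. This is a minimal illustration assuming a scikit-learn classifier, mean-encoding prototypes, and hand-picked weights; it is not the authors' implementation (their method ships with the open-source alibi library), and the loss terms below only approximate the objective described in the paper.

```python
# Minimal sketch of a prototype-guided counterfactual objective.
# Assumptions: a scikit-learn classifier, an identity "encoder", mean-encoding
# prototypes, and hand-picked weights; illustrative only, not the paper's
# exact loss or implementation.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

def encode(x):
    # Identity encoder for this toy example; the paper uses an autoencoder's
    # latent space or class-specific k-d trees to define prototypes instead.
    return x

# Class prototypes: mean encoding of each class's training instances.
prototypes = {c: encode(X[y == c]).mean(axis=0) for c in np.unique(y)}

def cf_loss(delta, x, target, c=1.0, beta=0.1, theta=0.5):
    """Prediction hinge + elastic-net sparsity + prototype attraction."""
    x_cf = x + delta
    probs = clf.predict_proba(x_cf.reshape(1, -1))[0]
    # Push the target class to become the most likely prediction.
    pred_term = c * max(0.0, np.delete(probs, target).max() - probs[target])
    # Keep the perturbation sparse and small (L1 + squared L2).
    sparsity = beta * np.abs(delta).sum() + np.square(delta).sum()
    # Pull the counterfactual toward the target-class prototype.
    proto_term = theta * np.square(encode(x_cf) - prototypes[target]).sum()
    return pred_term + sparsity + proto_term

# Black-box minimisation for illustration; the paper optimises a
# differentiable objective with gradient-based methods.
x0, target = X[0], 1
res = minimize(cf_loss, np.zeros_like(x0), args=(x0, target), method="Nelder-Mead")
x_cf = x0 + res.x
print("original:", clf.predict(x0.reshape(1, -1))[0],
      "-> counterfactual:", clf.predict(x_cf.reshape(1, -1))[0])
```

For the k-d tree variant mentioned in the abstract, the prototype would, roughly speaking, come from class-specific k-d trees built on the training data rather than from an encoder, which keeps the approach usable when no suitable autoencoder is available.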
Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties
TLDR: This work introduces a simple and fast method for generating interpretable CEs in a white-box setting without an auxiliary model, by using the predictive uncertainty of the classifier.
Conditional Generative Models for Counterfactual Explanations
TLDR: A general framework to generate sparse, in-distribution counterfactual model explanations which match a desired target prediction with a conditional generative model is proposed, allowing batches of counterfactual instances to be generated with a single forward pass.
Ensemble of Counterfactual Explainers
In eXplainable Artificial Intelligence (XAI), several counterfactual explainers have been proposed, each focusing on some desirable properties of counterfactual instances: minimality, actionability, …
Beyond Trivial Counterfactual Generations
  • 2020
Explainability of machine learning models has gained considerable attention within our research community given the importance of deploying more reliable machine-learning systems. Explainability can …
SCOUT: Self-Aware Discriminant Counterfactual Explanations
  • Pei Wang, N. Vasconcelos
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
TLDR: It is argued that self-awareness, namely the ability to produce classification confidence scores, is important for the computation of discriminant explanations, which seek to identify regions where it is easy to discriminate between prediction and counter class.
Counterfactual Explanations for Arbitrary Regression Models
TLDR: This work formulates CFE search for regression models in a rigorous mathematical framework using differentiable potentials, which resolves robustness issues in threshold-based objectives, and proves that in this framework verifying the existence of counterfactuals is NP-complete and finding instances using such potentials is CLS-complete.
On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning
TLDR: The present method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semi-factuals.
CARE: Coherent Actionable Recourse based on Sound Counterfactual Explanations
Counterfactual explanation methods interpret the outputs of a machine learning model in the form of "what-if scenarios" without compromising the fidelity-interpretability trade-off. They explain how …
GANterfactual - Counterfactual Explanations for Medical Non-Experts using Generative Adversarial Learning
TLDR: GANterfactual is presented, an approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques that lead to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems that work with saliency maps, namely LIME and LRP.
ECINN: Efficient Counterfactuals from Invertible Neural Networks
TLDR: A method is proposed, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently and outperforms established methods that generate heatmap-based explanations.

References

Showing 1-10 of 44 references
Explaining machine learning classifiers through diverse counterfactual explanations
TLDR: This work proposes a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes, and provides metrics that enable comparison of counterfactual-based methods to other local explanation methods.
Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives
TLDR: A novel method that provides contrastive explanations justifying the classification of an input by a black box classifier such as a deep neural network is proposed, and it is argued that such explanations are natural for humans and are used commonly in domains such as health care and criminology.
TIP: Typifying the Interpretability of Procedures
TLDR: A novel notion of what it means to be interpretable is provided, looking past the usual association with human understanding, and a framework that allows for comparing interpretable procedures by linking it to important practical aspects such as accuracy and robustness is defined.
Examples are not enough, learn to criticize! Criticism for Interpretability
TLDR: Motivated by the Bayesian model criticism framework, MMD-critic is developed, which efficiently learns prototypes and criticism, designed to aid human interpretability.
Model Agnostic Contrastive Explanations for Structured Data
TLDR: This work proposes a method, Model Agnostic Contrastive Explanations Method (MACEM), to generate contrastive explanations for any classification model where one is able to only query the class probabilities for a desired input, and quantitatively and qualitatively validates this approach over 5 public datasets covering diverse domains.
Comparison-Based Inverse Classification for Interpretability in Machine Learning
TLDR: An inverse classification approach is presented whose principle consists in determining the minimal changes needed to alter a prediction: in an instance-based framework, given a data point whose classification must be explained, the proposed method identifies a close neighbor classified differently, where the closeness definition integrates a sparsity constraint (a generic sketch of this idea appears below, after the reference list).
Anchors: High-Precision Model-Agnostic Explanations
We introduce a novel model-agnostic system that explains the behavior of complex models with high-precision rules called anchors, representing local, “sufficient” conditions for predictions. We …
Interpreting Black Box Predictions using Fisher Kernels
TLDR: This work takes a novel look at black box interpretation of test predictions in terms of training examples, making use of Fisher kernels as the defining feature embedding of each data point, combined with Sequential Bayesian Quadrature (SBQ) for efficient selection of examples.
Generating Contrastive Explanations with Monotonic Attribute Functions
TLDR: This paper proposes a method that can generate contrastive explanations for deep neural networks where aspects that are in themselves sufficient to justify the classification by the deep model are highlighted, but also new aspects which if added will change the classification.
Prototypical Networks for Few-shot Learning
TLDR: This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
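The comparison-based inverse classification entry above describes returning a close, differently-classified neighbor as the explanation, with sparsity built into the closeness measure. A generic reading of that idea is sketched below; the L1/L2 distance blend and the alpha weight are illustrative assumptions rather than the authors' exact formulation.

```python
# Generic, hypothetical sketch of the nearest differently-classified neighbour
# idea; the L1/L2 blend and the alpha weight are illustrative assumptions.
import numpy as np

def closest_counter_example(x, X_train, predict, alpha=0.5):
    """Return the training instance closest to x whose predicted class differs
    from the prediction for x, mixing L1 (sparsity) and squared L2 distance."""
    own_class = predict(x.reshape(1, -1))[0]
    candidates = X_train[predict(X_train) != own_class]
    diffs = candidates - x
    cost = alpha * np.abs(diffs).sum(axis=1) + (1.0 - alpha) * np.square(diffs).sum(axis=1)
    return candidates[np.argmin(cost)]
```

With the iris classifier from the earlier sketch, closest_counter_example(X[0], X, clf.predict) returns the differently-classified training instance nearest to the first sample.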