Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

  Pau Rodríguez López, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam H. Laradji, Laurent Charlin, David Vázquez
  2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Explainability for machine learning models has gained considerable attention within the research community, given the importance of deploying more reliable machine learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model’s input to change its prediction, providing details about the model’s decision-making. Current methods tend to generate trivial counterfactuals about a model’s decisions, as they often suggest to exaggerate or remove…
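The perturbation idea described in the abstract can be illustrated with a deliberately tiny sketch: a Wachter-style counterfactual search on a fixed two-feature logistic model, minimizing prediction loss plus a distance penalty. All weights, names, and hyperparameters below are illustrative placeholders, not the paper's actual method.

```python
import math

# Toy logistic "classifier" with fixed weights over two features.
W = [1.5, -2.0]
B = 0.25

def predict(x):
    z = W[0] * x[0] + W[1] * x[1] + B
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(x, target=0.9, lam=0.1, lr=0.5, steps=500):
    """Gradient-descent search for a small perturbation of x that moves
    the model's output toward `target` (a Wachter-style objective:
    (f(x') - target)^2 + lam * ||x' - x||^2)."""
    xc = list(x)
    for _ in range(steps):
        p = predict(xc)
        # Chain rule: d/dz sigmoid(z) = p * (1 - p).
        dp = 2.0 * (p - target) * p * (1.0 - p)
        for i in range(len(xc)):
            grad = dp * W[i] + 2.0 * lam * (xc[i] - x[i])
            xc[i] -= lr * grad
    return xc

x = [0.0, 1.0]            # original input, predicted class 0
xc = counterfactual(x)    # perturbed input pushed toward class 1
print(predict(x), predict(xc))
```

The distance penalty `lam` trades off how far the counterfactual may drift from the original input; generative methods like the one in this paper replace that pixel-space penalty with a learned data manifold.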


Diffusion Models for Counterfactual Explanations

DiME, a method allowing the generation of counterfactual images using recent diffusion models by leveraging the guided generative diffusion process, is proposed; it surpasses previous state-of-the-art results on 5 out of 6 metrics on CelebA.

Diverse Counterfactual Explanations for Anomaly Detection in Time Series

A model-agnostic algorithm that generates counterfactual ensemble explanations for time series anomaly detection models that can convey a richer interpretation of a model’s internal mechanism than existing methods is proposed.

OCTET: Object-aware Counterfactual Explanations

This work encodes the query image into a latent space structured to ease object-level manipulations, inspired by recent generative modeling work, and shows that the method can be adapted beyond classification, e.g., to explain semantic segmentation models.

Cycle-Consistent Counterfactuals by Latent Transformations

  • S. Khorram, Li Fuxin
  • Computer Science
    2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2022
A novel approach, Cycle-Consistent Counterfactuals by Latent Transformations (C3LT), which learns a latent transformation that automatically generates visual CFs by steering in the latent space of generative models and is easily plugged into any state-of-the-art pretrained generative network.

STEEX: Steering Counterfactual Explanations with Semantics

This work proposes a new generative counterfactual explanation framework that produces plausible and sparse modifications while preserving the overall scene structure, and introduces the concept of “region-targeted counterfactual explanations” together with a corresponding framework.

On Quantitative Evaluations of Counterfactuals

It is found that while most metrics behave as intended for sufficiently simple datasets, some fail to tell the difference between good and bad counterfactuals as complexity increases; the authors conclude that a proper quantitative evaluation of visual counterfactual examples should combine metrics to ensure that all aspects of good counterfactuals are quantified.

Bayesian Hierarchical Models for Counterfactual Estimation

This work argues that it is beneficial to provide several alternative explanations rather than a single point solution, and proposes a probabilistic paradigm to estimate a diverse set of counterfactuals, which leverages Bayesian hierarchical modeling to share information across different subgroups of a population.

"Why Here and Not There?" - Diverse Contrasting Explanations of Dimensionality Reduction

This work introduces the abstract concept of contrasting explanations for dimensionality reduction, and applies a realization of this concept to the specific application of explaining two-dimensional data visualizations.

When adversarial attacks become interpretable counterfactual explanations

It is argued that, when learning a 1-Lipschitz neural network with the dual loss of an optimal transportation problem, the gradient of the model is both the direction of the transportation plan and the direction to the closest adversarial attack, and the simple saliency map method becomes a reliable explanation.

Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals

This work presents a novel framework for computing visual counterfactual explanations based on two key ideas, notably enforcing that the replaced and replacer regions contain the same semantic part, resulting in more semantically consistent explanations.

GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

This work proposes a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions, and introduces the “Fréchet Inception Distance” (FID), which captures the similarity of generated images to real ones better than the Inception Score.
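For reference, the FID mentioned above compares Gaussian statistics of Inception features of real and generated images. In the standard formulation (with μ and Σ the mean and covariance of the real and generated feature distributions; this is the usual definition, not quoted from the summary above):

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
  - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower FID indicates that the two feature distributions are closer, i.e., that generated images are statistically more similar to real ones.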

Explanation by Progressive Exaggeration

This work proposes a method that explains the outcome of a classification black box by gradually exaggerating the semantic effect of a given class, gradually shifting the posterior probability from the original class to its negation.

xGEMs: Generating Examplars to Explain Black-Box Models

This work trains an unsupervised implicit generative model -- treated as a proxy to the data manifold -- to understand black-box classifier behavior by exploring the landscape of the underlying data manifold as data points cross decision boundaries.

VGGFace2: A Dataset for Recognising Faces across Pose and Age

A new large-scale face dataset named VGGFace2 is introduced, which contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject, and the automated and manual filtering stages to ensure a high accuracy for the images of each identity are described.

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
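The variational objective behind this approach is the standard evidence lower bound (ELBO), stated here in the usual VAE notation with encoder q_φ and decoder p_θ (a textbook statement, not taken verbatim from the summary above):

```latex
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\Vert\, p(z) \right)
```

Maximizing the right-hand side jointly trains the encoder and decoder; the reparameterization trick makes the expectation differentiable with respect to φ.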

Explaining Image Classifiers by Counterfactual Generation

This work samples plausible image in-fills by conditioning a generative model on the rest of the image, and optimizes to find the image regions that most change the classifier’s decision after in-filling.

Conditional Generative Models for Counterfactual Explanations

A general framework to generate sparse, in-distribution counterfactual model explanations which match a desired target prediction with a conditional generative model is proposed, allowing batches of counterfactual instances to be generated with a single forward pass.

Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation

A framework is proposed to generate counterfactuals for raw data instances using the proposed Attribute-Informed Perturbation (AIP); experiments demonstrate the effectiveness, sample quality, and efficiency of the designed framework, and show its superiority over alternatives.

Counterfactual Generative Networks

This work proposes to decompose the image generation process into independent causal mechanisms that train without direct supervision and allows for generating counterfactual images, and demonstrates the ability of the model to generate such images on MNIST and ImageNet.

Synbols: Probing Learning Algorithms with Synthetic Datasets

The tool’s high-level interface provides a language for rapidly generating new distributions over the latent features, including various types of textures and occlusions; to showcase the versatility of Synbols, it is used to dissect the limitations and flaws of standard learning algorithms in various learning setups.