Evaluating and Mitigating Bias in Image Classifiers: A Causal Perspective Using Counterfactuals

Saloni Dash, Vineeth N. Balasubramanian, and Amit Sharma. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
Counterfactual examples for an input—perturbations that change specific features but not others—have been shown to be useful for evaluating bias of machine learning models, e.g., against specific demographic groups. However, generating counterfactual examples for images is nontrivial due to the underlying causal structure on the various features of an image. To be meaningful, generated perturbations need to satisfy constraints implied by the causal model. We present a method for generating… 

Combining Counterfactuals With Shapley Values To Explain Image Models

This work develops a pipeline to generate counterfactuals and uses it to estimate Shapley values, which are used to obtain contrastive and interpretable explanations with strong axiomatic guarantees.
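
As a rough illustration of the underlying idea (a toy sketch under simplifying assumptions, not the paper's pipeline), Shapley values over a small feature set can be computed exactly by averaging each feature's marginal contribution across all coalitions, with features outside a coalition replaced by baseline values such as those a counterfactual generator would supply. The `model` and baseline below are invented for illustration:

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for a small feature set.

    Features absent from a coalition are replaced by their
    (e.g. counterfactually generated) baseline values.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for coalition in combinations(others, size):
                # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in coalition or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Toy model: a weighted sum of three features.
model = lambda z: 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[2]
print(shapley_values(model, [1, 1, 1], [0, 0, 0]))
```

For a linear model with a zero baseline, each Shapley value reduces to the feature's weight times its value, and the values sum to f(x) − f(baseline), the efficiency axiom that underlies such guarantees.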

On Noise Abduction for Answering Counterfactual Queries: A Practical Outlook

It may not be necessary to abduct all the noise variables in a structural causal model (SCM) to answer a counterfactual query; a graphical condition is introduced for identifying the required noise variables under an action consisting of an arbitrary combination of hard and soft interventions.
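
For context, the standard abduction–action–prediction recipe for answering a counterfactual query can be sketched on a toy two-variable SCM (invented here for illustration); note that this particular query never uses the noise term U_x, which is the kind of saving a noise-identification condition makes precise:

```python
# Toy SCM:  X := U_x,   Y := 2*X + U_y

def abduce(x_obs, y_obs):
    """Abduction: recover the noise values consistent with the observation."""
    u_x = x_obs
    u_y = y_obs - 2 * x_obs
    return u_x, u_y

def counterfactual_y(x_obs, y_obs, x_do):
    """Counterfactual query: what would Y have been under do(X = x_do)?"""
    _, u_y = abduce(x_obs, y_obs)   # U_x is abduced but never needed here
    return 2 * x_do + u_y           # action + prediction

print(counterfactual_y(x_obs=1.0, y_obs=3.5, x_do=2.0))  # 5.5
```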

Adversarial counterfactual augmentation: application in Alzheimer’s disease classification

A novel adversarial counterfactual augmentation scheme is proposed that aims to find the most effective synthesised images for improving downstream tasks, given a pre-trained generative model.

Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

A novel taxonomy is proposed to better organize the literature on debiasing methods for fairness-aware neural networks in the context of vision and language research and discusses the current challenges, trends, and important future work directions for the interested researcher and practitioner.

Probing Classifiers are Unreliable for Concept Removal and Detection

It is proved that a probing classifier is likely to use non-concept features, so that post-hoc or adversarial methods will fail to remove the concept correctly, and a spuriousness metric is proposed to gauge the quality of the final classifier.

Adaptation and Generalization for Unknown Sensitive Factors of Variations

This framework addresses AI assurance challenges at the intersection of domain adaptation, fairness, and counterfactual analysis, operating via the discovery of, and intervention on, factors of variation in data that significantly affect the robustness of AI models.

Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review

A rubric is designed with desirable properties of counterfactual explanation algorithms and all currently proposed algorithms against that rubric are evaluated, providing easy comparison and comprehension of the advantages and disadvantages of different approaches.

Causal Regularization Using Domain Priors

This work proposes a causal regularization method that can incorporate causal domain priors into the network, supporting both direct and total causal effects, and shows that the approach can generalize to various kinds of specifications of causal priors.

Matching Learned Causal Effects of Neural Networks with Domain Priors

This work proposes a regularization method that aligns the learned causal effects of a neural network with domain priors, including both direct and total causal effects, and shows that it is robust and achieves improved accuracy on noisy inputs.

Counterfactual Fairness for Facial Expression Recognition

A systematic investigation of the usage of counterfactuals to achieve both statistical and causal-based fairness in facial expression recognition, showing that the in-processing method outperforms the pre-processing and post-processing methods in terms of accuracy, F1 score, statistical fairness, and counterfactual fairness.

Detecting Bias with Generative Counterfactual Face Attribute Augmentation

A simple framework for identifying biases of a smiling attribute classifier is introduced, along with a set of metrics that measure the effect of manipulating a specific property of an image on the output of a trained classifier.

Characterizing Bias in Classifiers using Generative Models

This work incorporates a progressive conditional generative model for synthesizing photo-realistic facial images and Bayesian Optimization for an efficient interrogation of independent facial image classification systems and shows how this approach can be used to efficiently characterize racial and gender biases in commercial systems.

Generative Counterfactual Introspection for Explainable Deep Learning

An introspection technique for deep neural networks is presented that relies on a generative model to instigate salient editing of the input image for model interpretation, and it is demonstrated how this reveals interesting properties of the given classifiers.

Gender Slopes: Counterfactual Fairness for Computer Vision Models by Attribute Manipulation

This work proposes to use an encoder-decoder network developed for image attribute manipulation to synthesize facial images varying in the dimensions of gender and race while keeping other signals intact to measure counterfactual fairness of commercial computer vision classifiers.

Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations

This work introduces a method for efficiently explaining and regularizing differentiable models by examining and selectively penalizing their input gradients, which provide a normal to the decision boundary.

Generating Contrastive Explanations with Monotonic Attribute Functions

This paper proposes a method that can generate contrastive explanations for deep neural networks, highlighting aspects that are in themselves sufficient to justify the classification by the deep model, as well as new aspects which, if added, would change the classification.

Women also Snowboard: Overcoming Bias in Captioning Models

A new Equalizer model is introduced that ensures equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present; it has lower error than prior work when describing images with people and mentioning their gender, and more closely matches the ground-truth ratio of sentences including women to sentences including men.

CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training

It is shown that adversarial training can be used to learn a generative model with true observational and interventional distributions if the generator architecture is consistent with the given causal graph.

Counterfactual Fairness

This paper develops a framework for modeling fairness using tools from causal inference and demonstrates the framework on a real-world problem of fair prediction of success in law school.
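
As a minimal sketch of the definition (on a made-up linear SCM, not the paper's law-school model): a predictor is counterfactually fair if flipping the sensitive attribute in an individual's counterfactual world, holding the abduced noise fixed, leaves the prediction unchanged:

```python
# Toy SCM (invented for illustration):
#   A := U_a            (binary sensitive attribute)
#   X := 1.0*A + U_x    (feature influenced by A)

def counterfactually_fair(predict, a_obs, x_obs):
    """Check counterfactual fairness for one individual."""
    u_x = x_obs - 1.0 * a_obs        # abduction: recover U_x
    x_cf = 1.0 * (1 - a_obs) + u_x   # counterfactual X under flipped A
    return predict(x_obs) == predict(x_cf)

biased = lambda x: int(x > 1.0)      # uses X directly, inherits A's effect
fair = lambda x: 0                   # ignores X entirely: trivially fair

print(counterfactually_fair(biased, a_obs=1, x_obs=1.5))  # False
print(counterfactually_fair(fair, a_obs=1, x_obs=1.5))    # True
```

In practice the check is done in distribution over the abduced noise rather than pointwise, but the pointwise version above conveys the mechanics.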

Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints

This work proposes to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference to reduce the magnitude of bias amplification in multilabel object classification and visual semantic role labeling.