Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

@article{Chaudhari2022SimultaneousIO,
  title={Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data},
  author={B. Chaudhari and Akash Agarwal and Tanmoy Bhowmik},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.13182}
}
Machine learning models built on datasets that contain discriminative instances, attributable to various underlying factors, produce biased and unfair outcomes. It is well founded, and intuitive, that existing bias mitigation strategies often sacrifice accuracy in order to ensure fairness. But when an AI engine's predictions are used for decision making that affects revenue or operational efficiency, as in credit risk modeling, the business would find it desirable if accuracy could somehow be…


FairGen: Fair Synthetic Data Generation

This paper proposes a pipeline, independent of the GAN architecture, for generating fairer synthetic data by using a pre-processing algorithm to identify and remove bias-inducing samples. It claims that most GANs amplify the bias present in the training data when generating synthetic data, but that after these bias-inducing samples are removed, the GAN focuses more on the real, informative samples.
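
The pre-processing idea can be illustrated with a small sketch. This is not the paper's algorithm: the greedy heuristic, the use of statistical parity difference as the bias score, and all function names below are assumptions made for illustration only.

import numpy as np

def statistical_parity_difference(y, a):
    # Positive-outcome rate gap between the unprivileged (a == 0) and privileged (a == 1) groups.
    return y[a == 0].mean() - y[a == 1].mean()

def drop_bias_inducing_samples(X, y, a, budget=0.05):
    # Greedily drop the training samples whose removal most shrinks |SPD|,
    # up to `budget` (a fraction of the data), before fitting the GAN.
    keep = np.ones(len(y), dtype=bool)
    for _ in range(int(budget * len(y))):
        base = abs(statistical_parity_difference(y[keep], a[keep]))
        best_i, best_gain = None, 0.0
        for i in np.flatnonzero(keep):
            keep[i] = False
            gain = base - abs(statistical_parity_difference(y[keep], a[keep]))
            keep[i] = True
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:  # no single removal improves the metric further
            break
        keep[best_i] = False
    return X[keep], y[keep], a[keep]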

References

Showing 1–10 of 27 references

AI Fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias

A new open-source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license, to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms.
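
As a quick illustration of the toolkit's API, the snippet below measures statistical parity on the UCI Adult dataset and applies the Reweighing pre-processor. The dataset and protected-attribute choice are illustrative, and AIF360 expects the raw Adult CSV files to be downloaded into its data directory before AdultDataset can be loaded.

from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# 'sex' is one of the Adult dataset's protected attributes; 1 encodes the privileged group.
privileged = [{'sex': 1}]
unprivileged = [{'sex': 0}]

dataset = AdultDataset()  # needs the UCI Adult files placed under aif360's data/raw/adult directory

metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("statistical parity difference:", metric.statistical_parity_difference())

# Reweighing assigns instance weights that balance outcomes across groups before training.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_reweighed = rw.fit_transform(dataset)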

AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

A new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license to help facilitate the transition of fairness research algorithms to use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms.

The Case for Process Fairness in Learning: Feature Selection for Fair Decision Making

The case is made for notions of fairness that are based on the process of decision making rather than on the outcomes, which suggests that process fairness may be achieved with little cost to outcome fairness, but that some loss of accuracy is unavoidable.

A Survey on Bias and Fairness in Machine Learning

This survey investigated different real-world applications that have exhibited bias in various ways, and created a taxonomy of the fairness definitions that machine learning researchers have proposed to avoid existing bias in AI systems.

A comparative study of fairness-enhancing interventions in machine learning

It is found that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.

Counterfactual Fairness

This paper develops a framework for modeling fairness using tools from causal inference and demonstrates the framework on a real-world problem of fair prediction of success in law school.
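
For reference, the standard statement of the definition (notation from the usual causal-model formulation, not quoted from this page): a predictor \hat{Y} is counterfactually fair if intervening on the sensitive attribute A, while holding the latent background variables U fixed, leaves the prediction's distribution unchanged,

P\bigl(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x,\, A = a\bigr)
  \;=\; P\bigl(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x,\, A = a\bigr)
  \qquad \text{for all } y, x, a, a'.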

On formalizing fairness in prediction with machine learning

This article surveys how fairness is formalized in the machine learning literature for the task of prediction and presents these formalizations with their corresponding notions of distributive justice from the social sciences literature.

Equality of Opportunity in Supervised Learning

This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features, and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.
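
The criterion, usually called equal opportunity, can be written compactly for a binary target Y, binary sensitive attribute A, and predictor \hat{Y} (stated here from the standard formulation):

P\bigl(\hat{Y} = 1 \mid A = 0,\, Y = 1\bigr) \;=\; P\bigl(\hat{Y} = 1 \mid A = 1,\, Y = 1\bigr).

The stronger equalized-odds condition additionally requires the same equality conditioned on Y = 0.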

Why Unbiased Computational Processes Can Lead to Discriminative Decision Procedures

This chapter discusses the implicit modeling assumptions made by most data mining algorithms, shows situations in which they are not satisfied, and outlines three realistic scenarios in which an unbiased process can lead to discriminatory models.

Inherent Trade-Offs in the Fair Determination of Risk Scores

Some of the ways in which key notions of fairness are incompatible with each other are suggested, and hence a framework for thinking about the trade-offs between them is provided.
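
The three conditions shown to be mutually incompatible (except under perfect prediction or equal base rates) are, roughly, in the standard notation with risk score s(X), outcome Y, and group G:

\text{Calibration within groups:}\quad P\bigl(Y = 1 \mid s(X) = v,\, G = g\bigr) = v \;\;\text{for all } v, g,
\text{Balance for the positive class:}\quad \mathbb{E}\bigl[s(X) \mid Y = 1,\, G = g_1\bigr] = \mathbb{E}\bigl[s(X) \mid Y = 1,\, G = g_2\bigr],
\text{Balance for the negative class:}\quad \mathbb{E}\bigl[s(X) \mid Y = 0,\, G = g_1\bigr] = \mathbb{E}\bigl[s(X) \mid Y = 0,\, G = g_2\bigr].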