Fair preprocessing: towards understanding compositional fairness of data transformers in machine learning pipeline

  • Sumon Biswas, Hridesh Rajan
  • Published 2 June 2021
  • Computer Science
  • Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
In recent years, many incidents have been reported where machine learning models exhibited discrimination among people based on race, sex, age, etc. Research has been conducted to measure and mitigate unfairness in machine learning models. For a machine learning task, it is a common practice to build a pipeline that includes an ordered set of data preprocessing stages followed by a classifier. However, most of the research on fairness has considered a single classifier-based prediction task…
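The pipeline-level view of fairness described in the abstract can be illustrated with a minimal sketch. The stages, toy data, and metric below are invented for illustration and are not the paper's benchmark or API; the point is that a preprocessing stage, not the classifier, can introduce the group disparity:

```python
def selection_rates(preds, protected):
    """P(pred = 1) for each protected-attribute value."""
    rates = {}
    for g in sorted(set(protected)):
        idx = [i for i, z in enumerate(protected) if z == g]
        rates[g] = sum(preds[i] for i in idx) / len(idx)
    return rates

def impute_income(rows, default=0):
    """Toy preprocessing stage: fill missing income with a constant."""
    return [dict(r, income=r["income"] if r["income"] is not None else default)
            for r in rows]

def classify(rows, threshold=50):
    """Toy stand-in classifier: approve whenever income exceeds a threshold."""
    return [1 if r["income"] > threshold else 0 for r in rows]

# Missing incomes are concentrated in group "b", so the disparity below is
# introduced by the imputation stage, not by the classifier itself.
rows = [
    {"income": 80, "group": "a"}, {"income": 70, "group": "a"},
    {"income": None, "group": "b"}, {"income": 60, "group": "b"},
]
protected = [r["group"] for r in rows]
preds = classify(impute_income(rows))
print(selection_rates(preds, protected))  # {'a': 1.0, 'b': 0.5}
```

Auditing only `classify` in isolation would miss this: the unfairness is a property of the composed pipeline, which is the paper's motivating observation.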


Fairness Testing: A Comprehensive Survey and Analysis of Trends
A comprehensive survey of existing research on fairness testing is provided, collecting 113 papers and analyzing the research focus, trends, and promising directions, as well as widely adopted datasets and open-source tools for fairness testing.
A Comprehensive Empirical Study of Bias Mitigation Methods for Software Fairness
A large-scale, comprehensive empirical evaluation of 17 representative bias mitigation methods, assessed with 12 machine learning (ML) performance metrics, 4 fairness metrics, and 24 types of fairness-performance trade-off assessment, applied to 8 widely adopted benchmark software decision/prediction tasks.
How Robust is your Fair Model? Exploring the Robustness of Diverse Fairness Strategies
To the best of the authors' knowledge, this work is the first to quantitatively evaluate the robustness of fairness optimisation strategies, and it can serve as a guideline for choosing the most suitable fairness strategy for various datasets.
Towards data-centric what-if analysis for native machine learning pipelines
This work studies the problem of data-centric what-if analysis over whole ML pipelines (including data preparation and feature encoding), proposes optimisations that reuse trained models and intermediate data to reduce the runtime of such analysis, and conducts preliminary experiments on three complex example pipelines.
Bias analysis and mitigation in data-driven tools using provenance
A novel approach towards fairness analysis and bias mitigation is suggested, utilizing the notion of provenance, which has been shown to be useful for similar tasks in the context of data and process analyses.
Software Fairness: An Analysis and Survey
A clear view of the state-of-the-art in software fairness analysis is provided, including the need to study intersectional/sequential bias, policy-based bias handling, human-in-the-loop approaches, and socio-technical bias mitigation.
Cascaded Debiasing: Studying the Cumulative Effect of Multiple Fairness-Enhancing Interventions
The need for new fairness metrics that account for the impact on different population groups, apart from just the disparity between groups, is highlighted, and a list of combinations of interventions that perform best for different fairness and utility metrics is offered to aid the design of fair ML pipelines.
An Empirical Study of Modular Bias Mitigators and Ensembles
An open-source library enabling the modular composition of 10 mitigators, 4 ensembles, and their corresponding hyperparameters is built, and the space of combinations is empirically explored on 13 datasets, including datasets commonly used in the fairness literature plus datasets newly curated by the library.
NeuronFair: Interpretable White-Box Fairness Testing through Biased Neuron Identification
NeuronFair is a new DNN fairness testing framework that differs from previous work in several key aspects: interpretable - it quantitatively interprets DNNs' fairness violations for the biased decision; effective - it uses the interpretation results to guide the generation of more diverse instances in less time; generic - it can handle both structured and unstructured data.
The Art and Practice of Data Science Pipelines: A Comprehensive Study of Data Science Pipelines In Theory, In-The-Small, and In-The-Large
This work presents a three-pronged comprehensive study of data science pipelines, deriving three representations that capture their essence in theory, in-the-small, and in-the-large.
Towards Explaining the Effects of Data Preprocessing on Machine Learning
This work defines a simple metric, called volatility, to measure the effect of including or excluding a specific preprocessing step on the predictions made by the resulting model, with the ultimate goal of identifying predictors for volatility that are independent of the dataset and of the specific preprocessing step.
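One plausible reading of such a volatility metric is the fraction of predictions that flip when a single preprocessing step is toggled; the paper's exact definition may differ, and the pipeline below is a toy assumption:

```python
def volatility(preds_with_step, preds_without_step):
    """Fraction of predictions that flip when the step is excluded."""
    flips = sum(a != b for a, b in zip(preds_with_step, preds_without_step))
    return flips / len(preds_with_step)

# Toy pipeline: a max-scaling step changes which points clear a fixed
# threshold, so toggling it flips one of the four predictions.
data = [0.4, 0.6, 1.2, 2.0]
scale = lambda xs: [x / max(xs) for x in xs]          # preprocessing step
model = lambda xs: [1 if x > 0.5 else 0 for x in xs]  # stand-in classifier

print(volatility(model(scale(data)), model(data)))  # 0.25
```

A step with volatility near zero is a candidate for removal without affecting downstream behavior, which is what makes the metric useful for pipeline analysis.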
Measuring and Mitigating Unintended Bias in Text Classification
A new approach to measuring and mitigating unintended bias in machine learning models is introduced, using a set of common demographic identity terms as the subset of input features on which to measure bias.
Fairness testing: testing software for discrimination
It is demonstrated that fairness testing is a critical aspect of the software development cycle in domains with possible discrimination and initial tools for measuring software discrimination are provided.
Fairness Constraints: Mechanisms for Fair Classification
This paper introduces a flexible mechanism to design fair classifiers by leveraging a novel intuitive measure of decision boundary (un)fairness, and shows on real-world data that this mechanism allows for a fine-grained control on the degree of fairness, often at a small cost in terms of accuracy.
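The decision-boundary (un)fairness measure in this line of work can be sketched as the covariance between the protected attribute and the signed distance to the decision boundary; the variable names and toy distances below are illustrative, not the paper's notation:

```python
def boundary_covariance(z, distances):
    """Cov(z, d(x)); near zero means the boundary ignores group membership."""
    n = len(z)
    z_bar = sum(z) / n
    return sum((zi - z_bar) * di for zi, di in zip(z, distances)) / n

z = [0, 0, 1, 1]                   # protected attribute per example
d_unfair = [-2.0, -1.0, 1.0, 2.0]  # distances correlated with z
d_fair = [1.0, -1.0, -1.0, 1.0]    # distances uncorrelated with z

print(boundary_covariance(z, d_unfair))  # 0.75
print(boundary_covariance(z, d_fair))    # 0.0
```

Bounding this covariance during training is what gives such mechanisms fine-grained control over the fairness-accuracy trade-off.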
False Positives, False Negatives, and False Analyses: A Rejoinder to "Machine Bias: There's Software Used across the Country to Predict Future Criminals. and It's Biased against Blacks"
ProPublica recently released a much-heralded investigative report claiming that a risk assessment tool (known as COMPAS) used in criminal justice is biased against black defendants.
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments
It is demonstrated that the criteria cannot all be simultaneously satisfied when recidivism prevalence differs across groups, and how disparate impact can arise when an RPI fails to satisfy the criterion of error rate balance.
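Error rate balance, the criterion referenced above, requires equal false positive and false negative rates across groups. A minimal sketch of the check, with an illustrative helper and invented data (not code from the paper):

```python
def group_error_rates(y_true, y_pred, groups):
    """Per-group (false positive rate, false negative rate)."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        neg = [i for i in idx if y_true[i] == 0]
        pos = [i for i in idx if y_true[i] == 1]
        fpr = sum(y_pred[i] for i in neg) / len(neg)
        fnr = sum(1 - y_pred[i] for i in pos) / len(pos)
        out[g] = (fpr, fnr)
    return out

y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(group_error_rates(y_true, y_pred, groups))
# {'a': (0.5, 0.0), 'b': (0.0, 0.5)}
```

Here group "a" bears the false positives and group "b" the false negatives, the kind of asymmetry the paper shows cannot be avoided alongside calibration when base rates differ across groups.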
Certifying and Removing Disparate Impact
This work links disparate impact to a measure of classification accuracy that while known, has received relatively little attention and proposes a test for disparate impact based on how well the protected class can be predicted from the other attributes.
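The disparate-impact measure behind this work is the ratio of selection rates between the unprivileged and privileged groups (the basis of the "80% rule"); a minimal sketch with invented data and names:

```python
def disparate_impact(preds, protected, privileged):
    """P(pred = 1 | unprivileged) / P(pred = 1 | privileged)."""
    def rate(is_privileged):
        idx = [i for i, z in enumerate(protected)
               if (z == privileged) == is_privileged]
        return sum(preds[i] for i in idx) / len(idx)
    return rate(False) / rate(True)

preds = [1, 1, 0, 1, 0, 0, 1, 0]
protected = ["m", "m", "m", "m", "f", "f", "f", "f"]
di = disparate_impact(preds, protected, privileged="m")
print(di)  # ~0.33: below the 0.8 threshold, so disparate impact is flagged
```

The paper's certification test goes further, linking this ratio to how accurately the protected attribute can be predicted from the remaining features.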
German Credit Dataset: UCI Machine Learning Repository
  • 1994
Do the machine learning models on a crowd sourced platform exhibit bias? an empirical study on model fairness
This study creates a benchmark of 40 top-rated Kaggle models used for 5 different tasks, evaluates their fairness using a comprehensive set of fairness metrics, applies 7 mitigation techniques, and analyzes the fairness and mitigation results along with their impact on performance.
Fairness-Aware Instrumentation of Preprocessing Pipelines for Machine Learning
Fair-DAGs, an open-source library that extracts directed acyclic graph (DAG) representations of the data flow in preprocessing pipelines for ML, is proposed; it instruments the pipelines with tracing and visualization code to capture changes in data distributions and to identify distortions with respect to protected-group membership as the data travels through the pipeline.
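A Fair-DAGs-style tracer can be sketched as logging the protected-group distribution after every pipeline stage, so a distortion can be localized to the stage that introduced it. The stage, data, and tracer below are illustrative assumptions, not the library's actual API:

```python
def group_share(rows, group):
    """Fraction of rows belonging to the given protected group."""
    return sum(r["group"] == group for r in rows) / len(rows)

def trace_pipeline(stages, rows, group="b"):
    """Apply each (name, fn) stage, logging the group's share after each."""
    log = [("input", group_share(rows, group))]
    for name, fn in stages:
        rows = fn(rows)
        log.append((name, group_share(rows, group)))
    return rows, log

# Illustrative stage: dropping rows with missing values, a common
# preprocessing choice that can silently shrink one group.
drop_missing = lambda rows: [r for r in rows if r["income"] is not None]
stages = [("drop_missing", drop_missing)]

rows = [{"income": 50, "group": "a"}, {"income": None, "group": "b"},
        {"income": 40, "group": "b"}, {"income": 60, "group": "a"}]
_, log = trace_pipeline(stages, rows)
print(log)  # group "b"'s share falls from 0.5, localizing the distortion
```

In the library proper, the traced structure is a DAG rather than a linear list, and the logged statistics feed its visualizations; the per-stage logging idea is the same.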