Corpus ID: 8372620

A survey on measuring indirect discrimination in machine learning

Indrė Žliobaitė
Nowadays, many decisions are made using predictive models built on historical data. […] We also discuss related measures from other disciplines, which have not been used for measuring discrimination but could potentially be suitable for this purpose. We computationally analyze properties of selected measures, review and discuss measuring procedures, and present recommendations for practitioners. The primary target audience is data mining, machine learning, pattern recognition, statistical…
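Several of the measures the survey analyzes reduce to simple functions of per-group acceptance rates. As a minimal illustration (not code from the survey; the decision vectors below are invented), risk difference and risk ratio can be computed like this:

```python
def acceptance_rate(decisions):
    """Fraction of positive (accepting) decisions."""
    return sum(decisions) / len(decisions)

def risk_difference(protected, general):
    """p(+|general) - p(+|protected): 0 indicates parity."""
    return acceptance_rate(general) - acceptance_rate(protected)

def risk_ratio(protected, general):
    """p(+|protected) / p(+|general): 1 indicates parity."""
    return acceptance_rate(protected) / acceptance_rate(general)

# Hypothetical binary decisions (1 = accept) for two groups.
protected = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # 20% accepted
general   = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1]   # 60% accepted

print(round(risk_difference(protected, general), 2))  # prints 0.4
print(round(risk_ratio(protected, general), 2))       # prints 0.33
```

A risk difference of 0 and a risk ratio of 1 both indicate parity; part of the survey's contribution is comparing how such measures behave as group sizes and acceptance rates vary.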


Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models

This paper demonstrates empirically and theoretically with standard regression models that in order to make sure that decision models are non-discriminatory, for instance, with respect to race, the sensitive racial information needs to be used in the model building process.

Using Balancing Terms to Avoid Discrimination in Classification

  • Simon Enni, I. Assent
  • Computer Science
    2018 IEEE International Conference on Data Mining (ICDM)
  • 2018
The Balancing Terms (BT) method balances the error rates of any classifier with a differentiable prediction function, and unlike existing work, it can incorporate a preference for the trade-off between fairness and accuracy.

Fairness-Aware Learning with Prejudice Free Representations

A novel algorithm is proposed that can effectively identify and treat latent discriminating features and can be used as a key aid in proving that the model is free of discrimination towards regulatory compliance if the need arises.

Algorithmic Bias: From Discrimination Discovery to Fairness-aware Data Mining

The aim of this tutorial is to survey algorithmic bias, presenting its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions.

A Ranking Approach to Fair Classification

This paper focuses on scenarios where only imperfect labels are available and proposes a new fair ranking-based decision system built on monotonic relationships between legitimate features and the outcome, which outperforms traditional classification algorithms.

Fairness-aware machine learning: a perspective

Computer science needs to analyze the machine learning process as a whole to systematically explain the roots of discrimination, which will make it possible to devise global machine learning optimization criteria for guaranteed prevention, as opposed to pushing empirical constraints into existing algorithms case by case.

A Data Scientist’s Guide to Discrimination-Aware Classification

It is advocated that data scientists should be intentional about modeling and reducing discriminatory outcomes; without doing so, their efforts will perpetuate any systemic discrimination that may exist, under a misleading veil of data-driven objectivity.

Online Decision Trees with Fairness

A novel framework for online decision trees with fairness in data streams with possible distribution drift is proposed, along with two online tree-growth algorithms that fulfill different online fair decision-making requirements.

Unfairness Discovery and Prevention For Few-Shot Regression

It is demonstrated that the proposed unfairness discovery and prevention approaches efficiently detect discrimination, mitigate bias in model outputs, and generalize both accuracy and fairness to unseen tasks with a limited number of training samples.

On the Applicability of Machine Learning Fairness Notions

This paper is a survey of fairness notions that addresses the question of "which notion of fairness is most suited to a given real-world scenario and why?".



A Methodology for Direct and Indirect Discrimination Prevention in Data Mining

This paper discusses how to clean training data sets and outsourced data sets so that direct and/or indirect discriminatory decision rules are converted into legitimate (non-discriminatory) classification rules, and proposes new techniques applicable to direct or indirect discrimination prevention, individually or both at the same time.

Quantifying explainable discrimination and removing illegal discrimination in automated decision making

The refined notion of conditional non-discrimination in classifier design is introduced and it is shown that some of the differences in decisions across the sensitive groups can be explainable and are hence tolerable.

Discrimination Aware Decision Tree Learning

Experimental evaluation shows that the proposed approach advances the state-of-the-art in the sense that the learned decision trees have a lower discrimination than models provided by previous methods, with little loss in accuracy.

Combating discrimination using Bayesian networks

A discrimination discovery method based on modeling the probability distribution of a class using Bayesian networks and a classification method that corrects for the discovered discrimination without using protected attributes in the decision process are proposed.

Three naive Bayes approaches for discrimination-free classification

Three approaches for making the naive Bayes classifier discrimination-free are presented: (1) modifying the probability of the decision being positive, (2) training one model for every sensitive attribute value and balancing them, and (3) adding a latent variable to the Bayesian model that represents the unbiased label and optimizing the model parameters for likelihood using expectation maximization.
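The first of these approaches can be sketched as a per-group adjustment of the positive-decision rule. The sketch below is a simplification under assumed data: rather than modifying the learned class prior as in the paper, it lowers the protected group's decision threshold until that group's acceptance rate reaches the other group's.

```python
def acceptance_rate(scores, threshold):
    """Fraction of instances whose positive-class probability
    meets the decision threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def balanced_threshold(scores_prot, scores_gen, base=0.5):
    """Highest threshold for the protected group whose acceptance
    rate is at least the general group's rate at the base threshold."""
    target = acceptance_rate(scores_gen, base)
    for t in sorted(set(scores_prot), reverse=True):
        if acceptance_rate(scores_prot, t) >= target:
            return t
    return 0.0

# Hypothetical positive-class probabilities from a trained classifier.
prot = [0.20, 0.40, 0.45, 0.60]
gen  = [0.30, 0.55, 0.70, 0.90]
print(balanced_threshold(prot, gen))  # 0.4: accepts 3/4 in each group
```

Searching only over the observed scores keeps the procedure deterministic; the real method instead shifts probability mass in the model itself so the classifier, not a post-hoc rule, is discrimination-free.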

Data mining for discrimination discovery

This article formalizes the processes of direct and indirect discrimination discovery by modelling protected-by-law groups and contexts where discrimination occurs in a classification rule based syntax and proposes two inference models and provides automatic procedures for their implementation.

Fairness-Aware Classifier with Prejudice Remover Regularizer

A regularization approach is proposed that is applicable to any prediction algorithm with probabilistic discriminative models and applied to logistic regression and empirically show its effectiveness and efficiency.
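The original regularizer penalizes dependence (mutual information) between the prediction and the sensitive attribute; as a hedged sketch, the same idea can be approximated by adding a simpler penalty, the squared gap between group-mean predicted probabilities, to the logistic loss. All names and the penalty form here are illustrative, not the paper's:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fair_logistic_loss(w, X, y, s, eta=1.0):
    """Mean logistic loss plus a fairness penalty. The penalty is the
    squared gap between group-mean predicted probabilities, a simple
    stand-in for the paper's mutual-information prejudice remover."""
    n = len(y)
    preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for x in X]
    nll = -sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
               for yi, p in zip(y, preds)) / n
    g1 = [p for p, si in zip(preds, s) if si == 1]  # sensitive group
    g0 = [p for p, si in zip(preds, s) if si == 0]  # remaining instances
    gap = sum(g1) / len(g1) - sum(g0) / len(g0)
    return nll + eta * gap ** 2
```

Minimizing this objective trades off fit (the first term) against dependence of the predictions on the sensitive attribute s, with eta controlling the trade-off, mirroring the role of the regularization weight in the paper.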

Discrimination-aware data mining

This approach leads to a precise formulation of the redlining problem along with a formal result relating discriminatory rules with apparently safe ones by means of background knowledge, and an empirical assessment of the results on the German credit dataset.

Measuring Discrimination in Socially-Sensitive Decision Records

This work tackles the problem of determining, given a dataset of historical decision records, a precise measure of the degree of discrimination suffered by a given group in a given context with respect to the decision, and introduces a collection of quantitative measures of discrimination.

A study of top-k measures for discrimination discovery

The extent to which the sets of top-k ranked rules agree for any pair of measures is studied; the measures considered include risk difference, risk ratio, odds ratio, and a few others.
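A hedged sketch of the kind of comparison this paper performs: rank hypothetical discrimination-discovery rules, each summarized by a 2×2 contingency table of group membership versus decision outcome, under two measures and compute their top-k overlap. The counts and the overlap metric below are illustrative, not taken from the paper.

```python
def measures(a, b, c, d):
    """a, b: protected group receiving negative / positive decisions;
    c, d: the same counts for the unprotected group."""
    p1, p2 = a / (a + b), c / (c + d)  # per-group negative-decision rates
    return {
        "risk_difference": p1 - p2,
        "risk_ratio": p1 / p2,
        "odds_ratio": (a * d) / (b * c),
    }

def topk_agreement(tables, m1, m2, k):
    """Fraction of rules shared by the top-k rankings under two measures."""
    def top(m):
        order = sorted(range(len(tables)),
                       key=lambda i: measures(*tables[i])[m], reverse=True)
        return set(order[:k])
    return len(top(m1) & top(m2)) / k

# Three hypothetical rules with invented contingency counts.
tables = [(85, 15, 60, 40), (20, 80, 5, 95), (50, 50, 30, 70)]
print(topk_agreement(tables, "risk_difference", "odds_ratio", 2))  # 0.5
```

Here the two measures agree on only one of their top-2 rules, illustrating the paper's point that the choice of measure can change which rules are flagged as most discriminatory.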