• Corpus ID: 246015874

Fair Interpretable Learning via Correction Vectors

Mattia Cerrato, Marius Köppel, Alexander Segner, Stefan Kramer
Neural network architectures have been extensively employed in the fair representation learning setting, where the objective is to learn a new representation for a given vector which is independent of sensitive information. Various “representation debiasing” techniques have been proposed in the literature. However, as neural networks are inherently opaque, these methods are hard to comprehend, which limits their usefulness. We propose a new framework for fair representation learning which is… 
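The goal stated above, a representation that is independent of sensitive information, can be probed with a simple diagnostic. The following is a minimal sketch, not the paper's method: the helper `max_abs_correlation`, the synthetic data, and the per-group mean-centering step are all assumptions made for illustration. Note that zero linear correlation is only a necessary condition for independence, not a sufficient one.

```python
import numpy as np

def max_abs_correlation(z, s):
    """Largest absolute Pearson correlation between any column of z and s.

    Hypothetical helper: z is an (n, d) representation, s an (n,) sensitive attribute.
    """
    z = np.asarray(z, dtype=float)
    s = np.asarray(s, dtype=float)
    zc = z - z.mean(axis=0)
    sc = s - s.mean()
    denom = np.sqrt((zc ** 2).sum(axis=0) * (sc ** 2).sum())
    # Constant columns carry no linear information about s; report 0 for them.
    corr = np.divide(zc.T @ sc, denom, out=np.zeros(z.shape[1]), where=denom > 0)
    return float(np.abs(corr).max())

# Synthetic data (assumption): a binary sensitive attribute leaks into feature 0.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=1000)
x = rng.normal(size=(1000, 3))
x[:, 0] += 2.0 * s

# Toy debiasing step (assumption, not the paper's approach):
# subtract each group's mean so that group means coincide.
z = x.copy()
for g in (0, 1):
    z[s == g] -= x[s == g].mean(axis=0)

print("before:", max_abs_correlation(x, s))
print("after: ", max_abs_correlation(z, s))
```

On this toy data the correlation with the sensitive attribute is large before centering and numerically zero afterwards. A nonlinear dependence would not be detected by this check.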

Constraining deep representations with a noise module for fair classification
This paper builds on a domain adaptation neural model by augmenting it with a "noise conditioning" mechanism that is instrumental in obtaining fair representations, and provides experiments showing that the mechanism helps the networks ignore the sensitive attribute.
This model is based on a variational autoencoding architecture with priors that encourage independence between sensitive and latent factors of variation with an additional penalty term based on the “Maximum Mean Discrepancy” (MMD) measure.
Controllable Invariance through Adversarial Feature Learning
This paper shows that the proposed framework induces an invariant representation and leads to better generalization, as evidenced by improved performance on three benchmark tasks.
Learning Fair Representations
We propose a learning algorithm for fair classification that achieves both group fairness (the proportion of members in a protected group receiving positive classification is identical to the
Invariant Representations without Adversarial Training
It is shown that adversarial training is unnecessary and sometimes counter-productive; this work casts invariant representation learning as a single information-theoretic objective that can be directly optimized.
Fair pairwise learning to rank
A family of fair pairwise learning to rank approaches based on Neural Networks is presented, which are able to produce balanced outcomes for underprivileged groups and, at the same time, build fair representations of data, i.e. new vectors having no correlation with regard to a sensitive attribute.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Domain-Adversarial Training of Neural Networks
A new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. The approach can be implemented in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer.
Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment
A new notion of unfairness, disparate mistreatment, is introduced, defined in terms of misclassification rates, which is proposed for decision boundary-based classifiers and can be easily incorporated into their formulation as convex-concave constraints.
Approximation by superpositions of a sigmoidal function
G. Cybenko. Computer Science. Math. Control. Signals Syst., 1989.
In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real