Corpus ID: 3972891

Calibration for the (Computationally-Identifiable) Masses

@article{HbertJohnson2017CalibrationFT,
  title={Calibration for the (Computationally-Identifiable) Masses},
  author={{\'U}rsula H{\'e}bert-Johnson and Michael P. Kim and Omer Reingold and Guy N. Rothblum},
  journal={ArXiv},
  year={2017},
  volume={abs/1711.08513}
}
As algorithms increasingly inform and influence decisions made about individuals, it becomes increasingly important to address concerns that these algorithms might be discriminatory. The output of an algorithm can be discriminatory for many reasons, most notably: (1) the data used to train the algorithm might be biased (in various ways) to favor certain populations over others; (2) the analysis of this training data might inadvertently or maliciously introduce biases that are not borne out in… 
Multicalibration: Calibration for the (Computationally-Identifiable) Masses
We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from…
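As a rough illustration (assumed names, not the authors' implementation), multicalibration can be audited by checking that, within every subgroup from a given collection and every bucket of predicted values, the average outcome stays within alpha of the average prediction; the subgroups dict below is a hypothetical stand-in for a class of computationally-identifiable sets.

    import numpy as np

    def multicalibration_violations(p, y, subgroups, alpha=0.05, n_buckets=10):
        # p: predicted probabilities, y: binary outcomes (numpy arrays).
        # subgroups: dict mapping a name to a boolean membership mask.
        # Returns (name, bucket, gap) where |E[y | S, v] - E[p | S, v]| > alpha.
        buckets = np.minimum((p * n_buckets).astype(int), n_buckets - 1)
        violations = []
        for name, mask in subgroups.items():
            for b in range(n_buckets):
                idx = mask & (buckets == b)
                if not idx.any():
                    continue
                gap = abs(y[idx].mean() - p[idx].mean())
                if gap > alpha:
                    violations.append((name, b, gap))
        return violations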
Multiaccuracy: Black-Box Post-Processing for Fairness in Classification
It is proved that MULTIACCURACY-BOOST converges efficiently, and it is shown that if the initial model is accurate on an identifiable subgroup, then the post-processed model will be as well.
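A minimal sketch of such a post-processing loop, with a ridge regressor standing in (as an assumption) for the auditing learner: fit the auditor to the residuals and nudge the predictions against whatever bias it detects, stopping once nothing detectable remains.

    import numpy as np
    from sklearn.linear_model import Ridge

    def multiaccuracy_boost(p, X, y, eta=0.1, tol=0.01, max_iter=50):
        # p: initial predicted probabilities; X: features; y: binary outcomes.
        # Hypothetical stand-in for the paper's algorithm, not its implementation.
        p = p.copy()
        for _ in range(max_iter):
            residual = y - p
            h = Ridge(alpha=1.0).fit(X, residual).predict(X)
            if np.mean(h * residual) <= tol:      # auditor finds no remaining bias
                break
            p = np.clip(p + eta * h, 0.0, 1.0)    # correct predictions along h
        return p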
A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set
This paper presents a mathematical framework for the fair learning problem, specifically in the binary classification setting, and proposes to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set.
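The Disparate Impact index itself is simple to compute under its usual definition, the ratio of positive-decision rates between the unprivileged and privileged groups; a sketch:

    import numpy as np

    def disparate_impact(y_pred, privileged):
        # privileged: boolean array marking the privileged group.
        # DI = P(y_pred = 1 | unprivileged) / P(y_pred = 1 | privileged);
        # values below 0.8 trigger the common "four-fifths rule" red flag.
        return y_pred[~privileged].mean() / y_pred[privileged].mean()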
Tracking and Improving Information in the Service of Fairness
This work studies a formal framework for measuring the information content of predictors and shows that increasing information content through refinements improves the downstream selection rules across a wide range of fairness measures.
Fairness in Machine Learning
It is shown how causal Bayesian networks can play an important role in reasoning about and dealing with fairness, especially in complex unfairness scenarios, and how optimal transport theory can be leveraged to develop methods that impose constraints on the full shapes of distributions corresponding to different sensitive attributes.
A comparative study of fairness-enhancing interventions in machine learning
It is found that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.
Localized Calibration: Metrics and Recalibration
This work proposes the local calibration error (LCE), a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration, and introduces a localized recalibration method, LoRe, that improves the LCE more than existing recalibration methods.
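One way to picture a localized calibration metric (an illustrative assumption, not the paper's exact LCE) is a kernel-weighted calibration gap around a query point:

    import numpy as np

    def local_calibration_gap(x_query, X, p, y, bandwidth=1.0):
        # Weight each example by an RBF kernel on its distance to x_query,
        # then compare weighted average outcome with weighted average confidence.
        w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2 * bandwidth ** 2))
        w /= w.sum()
        return abs(np.dot(w, y) - np.dot(w, p))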
Tackling Algorithmic Bias in Neural-Network Classifiers using Wasserstein-2 Regularization
This paper introduces a new method to temper algorithmic bias in neural-network classifiers, based on the Gâteaux derivatives of the predictions' distribution; the model is algorithmically reasonable and makes it possible to use the authors' regularized loss with standard stochastic gradient-descent strategies.
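In one dimension the 2-Wasserstein distance has a closed form through quantile functions, which makes the flavor of such a penalty easy to sketch (this is the distance the paper builds on, not its exact regularized loss):

    import numpy as np

    def w2_squared_1d(scores_a, scores_b, n_quantiles=100):
        # Squared 2-Wasserstein distance between two 1-D empirical
        # distributions, e.g. classifier scores for two sensitive groups:
        # W2^2 = integral over t of (F_a^{-1}(t) - F_b^{-1}(t))^2 dt.
        qs = (np.arange(n_quantiles) + 0.5) / n_quantiles
        return np.mean((np.quantile(scores_a, qs) - np.quantile(scores_b, qs)) ** 2)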
From Soft Classifiers to Hard Decisions: How fair can we be?
There does not exist a general way to post-process a calibrated classifier to equalize protected groups' positive or negative predictive values (PPV or NPV); however, the analysis suggests a way to partially evade the impossibility results of Chouldechova and Kleinberg et al., which preclude equalizing all of these measures simultaneously.
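The quantities at stake are straightforward to measure once hard decisions are fixed; a sketch of PPV and NPV for one group's predictions:

    import numpy as np

    def ppv_npv(y_true, y_hard):
        # PPV = P(y = 1 | decision = 1), NPV = P(y = 0 | decision = 0).
        pos = y_hard == 1
        ppv = y_true[pos].mean() if pos.any() else float("nan")
        npv = (1 - y_true[~pos]).mean() if (~pos).any() else float("nan")
        return ppv, npv

Comparing the pair across protected groups makes the impossibility trade-offs concrete.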
Fairness in Machine Learning: A Survey
An overview of the different schools of thought and approaches to mitigating (social) biases and increasing fairness in the machine-learning literature is provided; approaches are organised into the widely accepted framework of pre-processing, in-processing, and post-processing methods, and further subcategorised into 11 method areas.

References

Showing 1-10 of 51 references
Algorithmic stability for adaptive data analysis
The first upper bounds on the number of samples required to answer more general families of queries, including arbitrary low-sensitivity queries and an important class of optimization queries (alternatively, risk minimization queries), are proved.
Fairer and more accurate, but for whom?
A model comparison framework for automatically identifying subgroups in which the differences between models are most pronounced is introduced, with a primary focus on identifying subgroups where the models differ in terms of fairness-related quantities such as racial or gender disparities.
On Fairness and Calibration
It is shown that calibration is compatible only with a single error constraint, and that any algorithm that satisfies this relaxation is no better than randomizing a percentage of predictions for an existing classifier.
Preserving Statistical Validity in Adaptive Data Analysis
It is shown that, surprisingly, there is a way to accurately estimate a number of expectations exponential in n even when the functions are chosen adaptively, an exponential improvement over standard empirical estimators, which are limited to a linear number of estimates.
Generalization in Adaptive Data Analysis and Holdout Reuse
A simple and practical method for reusing a holdout set to validate the accuracy of hypotheses produced by a learning algorithm operating on a training set and it is shown that a simple approach based on description length can also be used to give guarantees of statistical validity in adaptive settings.
The reusable holdout: Preserving validity in adaptive data analysis
A new approach to the challenges of adaptivity, based on insights from privacy-preserving data analysis, is demonstrated, showing how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses.
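The mechanism is well illustrated by a Thresholdout-style query (a sketch with assumed parameter values, not the paper's exact calibration of noise): answer with the training estimate unless it disagrees with a noisy holdout estimate, and only then spend the holdout by releasing a noisy holdout answer.

    import numpy as np

    def thresholdout_query(train_vals, holdout_vals, threshold=0.04, sigma=0.01):
        # train_vals / holdout_vals: per-example values of one query
        # evaluated on the training and holdout sets.
        t_mean, h_mean = train_vals.mean(), holdout_vals.mean()
        if abs(t_mean - h_mean) > threshold + np.random.laplace(0, sigma):
            return h_mean + np.random.laplace(0, sigma)  # noisy holdout answer
        return t_mean                                    # training answer is safe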
Agnostic Boosting
We extend the boosting paradigm to the realistic setting of agnostic learning, that is, to a setting where the training sample is generated by an arbitrary (unknown) probability distribution over…
Calibrated Structured Prediction
The notion of calibration is extended so as to handle various subtleties pertaining to the structured setting, and a simple recalibration method is provided that trains a binary classifier to predict probabilities of interest.
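That recalibration step, training a binary classifier to map raw scores to probabilities of correctness, can be sketched in a Platt-scaling flavor (an illustrative stand-in, not the paper's structured variant):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_recalibrator(scores_cal, correct_cal):
        # scores_cal: raw model scores on a calibration set;
        # correct_cal: 0/1 indicators of whether each prediction was correct.
        lr = LogisticRegression().fit(scores_cal.reshape(-1, 1), correct_cal)
        return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]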
A Confidence-Based Approach for Balancing Fairness and Accuracy
A new measure of fairness, called resilience to random bias (RRB), is proposed, and it is demonstrated that RRB distinguishes well between the authors' naive and sensible fairness algorithms and, together with bias and accuracy, provides a more complete picture of the fairness of an algorithm.
Private Multiplicative Weights Beyond Linear Queries
This work shows how to give accurate and differentially private solutions to exponentially many convex minimization problems on a sensitive dataset.