Measuring Model Biases in the Absence of Ground Truth
@article{Aka2021MeasuringMB,
  title   = {Measuring Model Biases in the Absence of Ground Truth},
  author  = {Osman Aka and Ken Burke and Alex Bauerle and Christina Greer and Margaret Mitchell},
  journal = {Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society},
  year    = {2021}
}
The measurement of bias in machine learning often focuses on model performance across identity subgroups (such as man and woman) with respect to ground-truth labels. However, these methods do not directly measure the associations that a model may have learned, for example between labels and identity subgroups. Further, measuring a model's bias requires a fully annotated evaluation dataset, which may not be easily available in practice. We present an elegant mathematical solution that tackles both…
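The abstract does not spell out the measure, but associations between labels and identity subgroups can be quantified without ground truth using co-occurrence statistics such as normalized pointwise mutual information (nPMI) over a model's predicted labels. A minimal sketch under that assumption (the function and count names are illustrative, not taken from the paper):

```python
import math

def npmi(label_counts, subgroup_counts, joint_counts, total):
    """Normalized PMI between each (label, subgroup) pair.

    Values lie in [-1, 1]; positive values mean the label co-occurs with
    the subgroup more often than chance, which can flag a learned bias.
    """
    scores = {}
    for (label, group), n_joint in joint_counts.items():
        p_joint = n_joint / total
        p_label = label_counts[label] / total
        p_group = subgroup_counts[group] / total
        pmi = math.log(p_joint / (p_label * p_group))
        scores[(label, group)] = pmi / -math.log(p_joint)
    return scores
```

For example, if a label appears on 50 of 100 images, a subgroup on 60, and they co-occur on 40, the nPMI is positive, indicating an above-chance association.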
13 Citations
Measuring Data
- Computer Science, ArXiv
- 2022
This work identifies the task of measuring data, to quantitatively characterize the composition of machine learning data and datasets, and motivates measuring data as a critical component of responsible AI development.
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
- Computer Science, ArXiv
- 2022
This work presents Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding, and finds that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment.
The Undesirable Dependence on Frequency of Gender Bias Metrics Based on Word Embeddings
- Computer Science, ArXiv
- 2023
This work studies the effect of frequency when measuring female vs. male gender bias with word embedding-based bias quantification methods and proves that the frequency-based effect observed in unshuffled corpora stems from properties of the metric rather than from word associations.
Handling Bias in Toxic Speech Detection: A Survey
- Business, ACM Computing Surveys
- 2023
Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers…
Social Norm Bias: Residual Harms of Fairness-Aware Algorithms
- Computer Science, Data Mining and Knowledge Discovery
- 2023
This work characterizes Social Norm Bias (SNoB), a subtle but consequential type of algorithmic discrimination that may be exhibited by machine learning models even when these systems achieve group fairness objectives, by measuring how an algorithm's predictions are associated with conformity to inferred gender norms.
Fake it till you make it: Learning(s) from a synthetic ImageNet clone
- Computer Science, ArXiv
- 2022
It is shown that, with minimal and class-agnostic prompt engineering, the ImageNet clones the authors denote ImageNet-SD close a large part of the gap between models trained on synthetic images and models trained on real images across the several standard classification benchmarks considered in this study.
Re-contextualizing Fairness in NLP: The Case of India
- Sociology, AACL
- 2022
Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In…
BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling
- Computer Science, Proces. del Leng. Natural
- 2022
This work experiments with different sampling methods on the Spanish version of mC4 and presents a novel data-centric technique, named perplexity sampling, that enables the pre-training of language models in roughly half the number of steps and using one third of the data.
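As a rough illustration of the idea rather than BERTIN's exact procedure, perplexity sampling can be sketched as keeping each document with a probability that peaks near a target perplexity (e.g. the corpus median), down-weighting both trivially easy and very noisy text:

```python
import math
import random

def perplexity_keep_prob(ppl, median, width):
    """Gaussian-shaped keep probability centred on the median perplexity.

    Documents whose perplexity is 'typical' for the corpus are kept with
    probability near 1; outliers on either side are mostly discarded.
    """
    return math.exp(-((ppl - median) ** 2) / (2 * width ** 2))

def perplexity_sample(docs, ppls, median, width, seed=0):
    """Subsample (doc, perplexity) pairs by the keep probability above."""
    rng = random.Random(seed)
    return [d for d, p in zip(docs, ppls)
            if rng.random() < perplexity_keep_prob(p, median, width)]
```

The `median` and `width` parameters are assumptions standing in for whatever target statistics a real pipeline would estimate from a held-out language model.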
The Equity Framework: Fairness Beyond Equalized Predictive Outcomes
- Economics, ArXiv
- 2022
Machine Learning (ML) decision-making algorithms are now widely used in predictive decision-making, for example, to determine who to admit and give a loan. Their wide usage and consequential effects…
Seeing without Looking: Analysis Pipeline for Child Sexual Abuse Datasets
- Computer Science, FAccT
- 2022
It is argued that automatic signals can highlight important aspects of the overall distribution of data, which is valuable for databases that cannot be disclosed.
References
Showing 1-10 of 30 references
A NEW MEASURE OF RANK CORRELATION
- Mathematics
- 1938
1. In psychological work, the problem of comparing two different rankings of the same set of individuals may be divided into two types. In the first type the individuals have a given order A which is…
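This 1938 paper introduces what is now known as Kendall's tau, the rank correlation used by Aka et al. to compare bias rankings. A brute-force sketch of the coefficient, counting concordant versus discordant pairs:

```python
def kendall_tau(a, b):
    """Kendall's tau for two rankings without ties:
    (concordant pairs - discordant pairs) / total pairs."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Identical rankings give tau = 1, fully reversed rankings give tau = -1; production code would use `scipy.stats.kendalltau`, which also handles ties.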
Equality of Opportunity in Supervised Learning
- Computer Science, NIPS
- 2016
This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.
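The criterion proposed here, Equality of Opportunity, requires equal true positive rates across groups defined by the sensitive attribute. A minimal sketch of the per-group check (names are illustrative):

```python
from collections import defaultdict

def true_positive_rates(y_true, y_pred, groups):
    """Per-group true positive rate (recall on the positive class).

    Equality of Opportunity asks that these rates be equal across the
    groups defined by the sensitive attribute.
    """
    tp, pos = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            pos[g] += 1
            tp[g] += int(p == 1)
    return {g: tp[g] / pos[g] for g in pos}
```

A large gap between the returned rates signals a violation; the paper further shows how to post-process any predictor to close that gap.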
Word Association Norms, Mutual Information and Lexicography
- Linguistics, ACL
- 1989
The proposed measure, the association ratio, estimates word association norms directly from computer readable corpora, making it possible to estimate norms for tens of thousands of words.
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
- 2018
Distributional Structure
- Linguistics
- 1954
This paper discusses how each language can be described in terms of a distributional structure, i.e. in terms of the occurrence of parts relative to other parts, and how this description is complete without intrusion of other features such as history or meaning.
Predictive Inequity in Object Detection
- Computer Science, ArXiv
- 2019
This work annotates an existing large scale dataset which contains pedestrians with Fitzpatrick skin tones in ranges [1-3] or [4-6], and provides an in-depth comparative analysis of performance between these two skin tone groupings, finding that neither time of day nor occlusion explain this behavior.
ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases
- Computer Science, ECCV
- 2018
It is experimentally demonstrated that the accuracy and robustness of ConvNets measured on ImageNet are vastly underestimated, and that explanations can mitigate the impact of misclassified adversarial examples from the perspective of the end-user.
Women also Snowboard: Overcoming Bias in Captioning Models
- Computer Science, ECCV
- 2018
A new Equalizer model is introduced that ensures equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present. It has lower error than prior work when describing images with people and mentioning their gender, and more closely matches the ground-truth ratio of sentences mentioning women to sentences mentioning men.