Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints

Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang
Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. […] We propose to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference. Our method results in almost no performance loss for the underlying recognition task but decreases the magnitude of bias amplification by 47.5% and 40.5% for multilabel classification…
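The corpus-level calibration described in the abstract can be sketched as follows. This is a minimal illustration of the Lagrangian-relaxation idea, not the authors' implementation: for a single constrained verb with a binary gender label, one Lagrange multiplier per one-sided constraint penalizes predictions that push the corpus-level gender ratio outside a margin around the training ratio, and the multipliers are updated by subgradient ascent between inference passes. All names, hyperparameters, and the toy data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: one verb, binary agent gender, so each instance i just has
# scores[i, g] for g in {woman=0, man=1}.
n = 200
scores = rng.normal(size=(n, 2))
scores[:, 0] += 1.0          # biased model: strongly prefers "woman"

b_star = 0.55                # training-corpus ratio  woman / (woman + man)
margin = 0.05                # allowed deviation around b_star
lam_lo, lam_hi = 0.0, 0.0    # multipliers for the two one-sided constraints
lr = 0.5                     # subgradient step size

for _ in range(300):
    # Inference with Lagrangian-adjusted scores: the multipliers shift the
    # relative score of predicting "woman" vs. "man" for this verb.
    adj = scores.copy()
    adj[:, 0] += lam_lo - lam_hi
    pred = adj.argmax(axis=1)

    ratio = float((pred == 0).mean())   # predicted woman-ratio on the corpus
    # Subgradient ascent on the constraints
    #   ratio <= b_star + margin   and   ratio >= b_star - margin
    lam_hi = max(0.0, lam_hi + lr * (ratio - (b_star + margin)))
    lam_lo = max(0.0, lam_lo + lr * ((b_star - margin) - ratio))

print(f"calibrated predicted ratio: {ratio:.3f} (target {b_star}±{margin})")
```

Without the multipliers the toy model predicts "woman" for roughly three quarters of the instances; the updates drive the corpus-level ratio back toward the constraint window while each instance is still decoded by a per-instance argmax.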


Mitigating Gender Bias in Captioning Systems

A new Guided Attention Image Captioning model (GAIC) is proposed that provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence; experiments validate that GAIC can significantly reduce gender prediction errors while maintaining competitive caption quality.

Identifying and Reducing Gender Bias in Word-Level Language Models

This study proposes a metric to measure gender bias and a regularization loss term for the language model that minimizes the projection of encoder-trained embeddings onto an embedding subspace encoding gender; this regularization method is found to be effective in reducing gender bias.
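The regularization idea summarized above can be sketched in a few lines. This is an illustrative reading of the technique, not the paper's code: a gender direction is estimated from a defining word pair, and the penalty is the summed squared projection of gender-neutral word embeddings onto that direction. Function names, indices, and the toy embedding table are assumptions.

```python
import numpy as np

def gender_direction(emb, he_idx, she_idx):
    """Unit vector along a gender-defining difference, e.g. he - she."""
    d = emb[he_idx] - emb[she_idx]
    return d / np.linalg.norm(d)

def bias_regularizer(emb, g, neutral_idx):
    """Sum of squared projections of gender-neutral words onto g."""
    proj = emb[neutral_idx] @ g          # scalar projection per neutral word
    return float(np.sum(proj ** 2))

rng = np.random.default_rng(1)
emb = rng.normal(size=(10, 8))           # toy embedding table (vocab x dim)
g = gender_direction(emb, he_idx=0, she_idx=1)
loss_bias = bias_regularizer(emb, g, neutral_idx=[2, 3, 4])
print(loss_bias)   # scaled by a hyperparameter and added to the LM loss
```

In training, this term would be weighted and added to the usual language-modeling loss, so gradient descent shrinks the gender component of neutral words' embeddings.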

Gender and Racial Bias in Visual Question Answering Datasets

This work investigates gender and racial bias in five VQA datasets and finds that the distribution of answers is highly different between questions about women and men, as well as the existence of detrimental gender-stereotypical samples.

Women also Snowboard: Overcoming Bias in Captioning Models

A new Equalizer model is introduced that encourages equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present; it has lower error than prior work when describing images with people and mentioning their gender, and more closely matches the ground-truth ratio of sentences mentioning women to sentences mentioning men.

To “See” Is to Stereotype: Image Tagging Algorithms, Gender Recognition, and the Accuracy–Fairness Trade-off

Evaluating five proprietary image-tagging algorithms, it is found that in three of them gender inference is hindered when a background is introduced, and that the algorithm whose output is most consistent with human stereotyping processes is the one superior at recognizing gender.

Understanding and Evaluating Racial Biases in Image Captioning

Differences in caption performance, sentiment, and word choice between images of lighter- versus darker-skinned people are found to be greater in modern captioning systems than in older ones, raising concerns that, without proper consideration and mitigation, these differences will only become more prevalent.

Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation

This work investigates how seniority impacts the degree of gender bias exhibited in pretrained neural generation models by introducing a novel framework for probing compound bias and shows that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.

Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search

Two novel debiasing approaches are introduced: an in-processing fair sampling method to address the gender-imbalance issue when training models, and a post-processing feature-clipping method based on mutual information to debias multimodal representations of pre-trained models.

Exposing and Correcting the Gender Bias in Image Captioning Datasets and Models

This work investigates gender bias in the COCO captioning dataset and shows that it arises not only from the statistical distribution of genders across contexts but also from flawed annotations by the human annotators.

A study on the distribution of social biases in self-supervised learning visual models

It is shown that there is a correlation between the type of the SSL model and the number of biases that it incorporates, and the results suggest that this number does not strictly depend on the model’s accuracy and changes throughout the network.



Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
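The core operation behind this style of debiasing can be sketched as a projection-removal step. This is a minimal illustration of the "neutralize" idea (the full method also equalizes gendered word pairs), with hypothetical names and toy vectors: each gender-neutral word vector has its component along the gender direction removed and is then renormalized.

```python
import numpy as np

def neutralize(v, g):
    """Remove the component of v along the unit direction g, renormalize."""
    v_debiased = v - (v @ g) * g
    return v_debiased / np.linalg.norm(v_debiased)

rng = np.random.default_rng(2)
g = rng.normal(size=8)       # toy gender direction
g /= np.linalg.norm(g)

v = rng.normal(size=8)       # toy embedding of a gender-neutral word
v_hat = neutralize(v, g)
print(abs(float(v_hat @ g)))   # ~0: no gender component remains
```

After this step, any two words that differed only in their gender component become indistinguishable along `g`, which is what makes analogy tasks and clustering survive while the bias direction is zeroed out.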

Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels

This paper proposes an algorithm to decouple the human reporting bias from the correct visually grounded labels, and shows significant improvements over traditional algorithms for both image classification and image captioning, doubling the performance of existing methods in some cases.

Semantics derived automatically from language corpora necessarily contain human biases

It is shown for the first time that human-like semantic biases result from the application of standard machine learning to ordinary language---the same sort of language humans are exposed to every day.

Semantics derived automatically from language corpora contain human-like biases

It is shown that machines can learn word associations from written texts and that these associations mirror those learned by humans, as measured by the Implicit Association Test (IAT), and that applying machine learning to ordinary human language results in human-like semantic biases.

Situation Recognition: Visual Semantic Role Labeling for Image Understanding

This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts, including: (1) the main activity (e.g., clipping), (2) the participating…

Tractable Semi-supervised Learning of Complex Structured Prediction Models

An approximate semi-supervised learning method that uses piecewise training for estimating the model weights and a dual decomposition approach for solving the inference problem of finding the labels of unlabeled data subject to domain specific constraints is proposed.

Constrained Semi-supervised Learning in the Presence of Unanticipated Classes

This thesis argues that many AKBC tasks which have previously been addressed separately can be viewed as instances of a single abstract problem, multiview semi-supervised learning with an incomplete class hierarchy, and presents a generic EM framework for solving this abstract task.

VQA: Visual Question Answering

We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer.

Extracting implicit knowledge from text

This work considers the extraction of knowledge that is conveyed implicitly, both within everyday texts and queries posed to internet search engines, and shows that a significant amount of general knowledge can be gleaned based on how the authors talk about the world.

A survey on measuring indirect discrimination in machine learning

This survey reviews and organizes various discrimination measures that have been used for measuring discrimination in data and for evaluating the performance of discrimination-aware predictive models, and computationally analyzes properties of selected measures.