OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings

@article{Dev2020OSCaROS,
  title={OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings},
  author={Sunipa Dev and Tao Li and J. M. Phillips and Vivek Srikumar},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.00049}
}
Language representations are known to carry stereotypical biases and, as a result, lead to biased predictions in downstream tasks. While existing methods are effective at mitigating biases by linear projection, such methods are too aggressive: they not only remove bias, but also erase valuable information from word embeddings. We develop new measures for evaluating specific information retention that demonstrate the tradeoff between bias removal and information retention. To address this… 
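
For context on the projection-based mitigation the abstract contrasts with, below is a minimal sketch of debiasing by linear projection; the toy vectors and the he/she pair used to estimate the bias direction are illustrative placeholders, not the paper's actual setup.

```python
import numpy as np

def project_out(vectors: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove from each row vector its component along `direction`."""
    d = direction / np.linalg.norm(direction)   # unit-length bias direction
    return vectors - np.outer(vectors @ d, d)   # subtract the projection onto d

# Toy example: estimate a gender direction from one definitional pair.
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["he", "she", "doctor", "nurse"]}
bias_direction = emb["he"] - emb["she"]
debiased = project_out(np.stack(list(emb.values())), bias_direction)
```

As the abstract notes, this operation removes the entire component along the bias direction, which is why it can erase valuable information along with the bias.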

Marked Attribute Bias in Natural Language Inference

A new observation of gender bias in a downstream NLP application: marked attribute bias in natural language inference is presented, and a new postprocessing debiasing scheme for static word embeddings is proposed.

A Visual Tour of Bias Mitigation Techniques for Word Representations

To help understand how various debiasing techniques change the underlying geometry, this tutorial decomposes each technique into interpretable sequences of primitive operations and studies their effect on the word vectors using dimensionality reduction and interactive visual exploration.

Iterative adversarial removal of gender bias in pretrained word embeddings

This paper proposes an iterative and adversarial procedure to remove gender influence from word representations that should otherwise be free of it, while retaining meaningful gender information in words that are inherently charged with gender polarity.

VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations

Word vector embeddings have been shown to contain and amplify biases in the data they are extracted from. Consequently, many techniques have been proposed to identify, mitigate, and attenuate these biases.

UNQOVERing Stereotypical Biases via Underspecified Questions

UNQOVER, a general framework to probe and quantify biases through underspecified questions, is presented, showing that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors: positional dependence and question independence.

The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability

This work addresses some of the problems pertaining to the transparency and interpretability of high-dimensional representations of language, including the detection, quantification, and mitigation of socially biased associations in those representations.

An Interactive Visual Demo of Bias Mitigation Techniques for Word Representations From a Geometric Perspective

This demo utilizes interactive visualization to increase the interpretability of a number of state-of-the-art techniques designed to identify, mitigate, and attenuate biases in word representations, in particular, from a geometric perspective.

A Survey on Bias in Deep NLP

Bias is introduced in a formal way, its treatment in several networks is reviewed in terms of detection and correction, and a strategy for dealing with bias in deep NLP is proposed.

Stereotype and Categorical Bias Evaluation via Differential Cosine Bias Measure

A vast range of Natural Language Processing (NLP) systems that are in use today have a direct impact on humans. While machine learning models are expected to automatically infer world knowledge from…

References

Showing 1-10 of 38 references

On Measuring and Mitigating Biased Inferences of Word Embeddings

A mechanism for measuring stereotypes using the task of natural language inference is designed, a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe) is demonstrated, and it is shown that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings.

Attenuating Bias in Word Vectors

New, simple ways to detect the most stereotypically gendered words in an embedding and remove the bias from them are explored, and it is verified that names are masked carriers of gender bias, which are then used as a tool to attenuate bias in embeddings.
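
A common way to detect the "most stereotypically gendered words" in the sense of the summary above is to rank words by how strongly they align with a gender direction; the sketch below illustrates that idea with placeholder definitional pairs and is not the authors' code.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_by_gender_alignment(emb, pairs=(("he", "she"), ("man", "woman"))):
    """Score each word by |cosine| with a gender direction averaged over pairs."""
    direction = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)
    scores = {w: cosine(v, direction) for w, v in emb.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
```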

Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them

Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society, causing serious concern. Several recent…

A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces

Experimental findings across three embedding methods suggest that the proposed debiasing models are robust and widely applicable: they often completely remove the bias both implicitly and explicitly without degradation of semantic information encoded in any of the input distributional spaces.

VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations

Word vector embeddings have been shown to contain and amplify biases in the data they are extracted from. Consequently, many techniques have been proposed to identify, mitigate, and attenuate these biases.

Gender Bias in Contextualized Word Embeddings

It is shown that a state-of-the-art coreference system that depends on ELMo inherits its bias and demonstrates significant bias on the WinoBias probing corpus and two methods to mitigate such gender bias are explored.

Learning Gender-Neutral Word Embeddings

A novel training procedure for learning gender-neutral word embeddings that preserves gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence is proposed.

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving its useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
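
The analogy tasks mentioned in the summary are typically evaluated with plain vector arithmetic; the sketch below shows such a query ("a is to b as c is to ?") over a hypothetical embedding dictionary and is not the paper's evaluation code.

```python
import numpy as np

def analogy(emb: dict, a: str, b: str, c: str) -> str:
    """Return the word whose vector is closest (by cosine) to emb[b] - emb[a] + emb[c]."""
    target = emb[b] - emb[a] + emb[c]
    target /= np.linalg.norm(target)
    best_word, best_sim = None, -np.inf
    for word, vec in emb.items():
        if word in (a, b, c):                    # exclude the query words themselves
            continue
        sim = vec @ target / np.linalg.norm(vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

# On a biased embedding, a query like analogy(emb, "man", "programmer", "woman")
# may return a stereotyped completion, which is what the paper's title highlights.
```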

Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

This work presents Iterative Null-space Projection (INLP), a novel method for removing information from neural representations based on repeated training of linear classifiers that predict a certain property the authors aim to remove, followed by projection of the representations on their null-space.
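
A rough sketch of the iterative null-space projection loop described above, using a scikit-learn logistic-regression probe; the classifier choice, iteration count, and the simple composition of projections are assumptions for illustration rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X: np.ndarray, y: np.ndarray, n_iters: int = 10) -> np.ndarray:
    """Repeatedly fit a linear probe for y and project X off the direction it uses."""
    P = np.eye(X.shape[1])                              # accumulated projection
    for _ in range(n_iters):
        probe = LogisticRegression(max_iter=1000).fit(X @ P, y)
        w = probe.coef_ / np.linalg.norm(probe.coef_)   # (1, d) probe direction
        P = P @ (np.eye(X.shape[1]) - w.T @ w)          # remove that direction
    return X @ P
```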

Towards Understanding Gender Bias in Relation Extraction

Recent developments in Neural Relation Extraction (NRE) have made significant strides towards Automated Knowledge Base Construction. While much attention has been dedicated towards improvements in…