UNQOVERing Stereotypical Biases via Underspecified Questions

@inproceedings{Li2020UNQOVERingSB,
  title={UNQOVERing Stereotypical Biases via Underspecified Questions},
  author={Tao Li and Daniel Khashabi and Tushar Khot and Ashish Sabharwal and Vivek Srikumar},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2020},
  year={2020}
}
While language embeddings have been shown to have stereotyping biases, how these biases affect downstream question answering (QA) models remains unexplored. We present UNQOVER, a general framework to probe and quantify biases through underspecified questions. We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors: positional dependence and question independence. We design a formalism that isolates the aforementioned errors. As case… 
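To make the positional-dependence and question-independence issues concrete, here is a rough sketch of the kind of correction the abstract describes: average the QA model's score for each subject over both subject orderings, and subtract its score on the question with the negated attribute. The template, attributes, off-the-shelf SQuAD model, and the way subject scores are read off the pipeline are illustrative assumptions, not the paper's exact formalism.

# Illustrative sketch only: probe an extractive QA model with an underspecified
# context, average over both subject orderings (positional dependence), and
# subtract the score on the negated-attribute question (question independence).
# Template, attributes, and model choice are assumptions, not the paper's setup.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def subject_score(context: str, question: str, subject: str) -> float:
    """Sum the scores of the top predicted spans that mention `subject`."""
    preds = qa(question=question, context=context, top_k=10)
    return sum(p["score"] for p in preds if subject.lower() in p["answer"].lower())

def preference(subj1: str, subj2: str, attr: str, neg_attr: str) -> float:
    """Positive value: the model associates `attr` more with subj1 than subj2."""
    def avg(subject: str, a: str) -> float:
        c12 = f"{subj1} got off the flight to visit {subj2}."
        c21 = f"{subj2} got off the flight to visit {subj1}."
        q = f"Who {a}?"
        return 0.5 * (subject_score(c12, q, subject) + subject_score(c21, q, subject))
    bias1 = avg(subj1, attr) - avg(subj1, neg_attr)
    bias2 = avg(subj2, attr) - avg(subj2, neg_attr)
    return 0.5 * (bias1 - bias2)

print(preference("John", "Mary", "was a bad driver", "was a good driver"))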
BBQ: A Hand-Built Bias Benchmark for Question Answering
TLDR
The Bias Benchmark for QA (BBQ) is introduced: a dataset of question sets constructed by the authors that highlight attested social biases against people belonging to protected classes along nine social dimensions relevant to U.S. English-speaking contexts.
Eliciting Bias in Question Answering Models through Ambiguity
Question answering (QA) models use retriever and reader systems to answer questions. Because QA systems rely on training data, their responses can amplify or reflect inequity. Many QA models…
Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources
TLDR
It is found that ConceptNet contains severe biases and disparities across four demographic categories, and that a filter-based bias-mitigation approach can reduce these issues in both the resource and models but leads to a performance drop, leaving room for future work to build fairer and stronger commonsense models.
Evaluating Debiasing Techniques for Intersectional Biases
TLDR
It is argued that a truly fair model must consider ‘gerrymandering’ groups, which comprise not only single attributes but also intersectional groups, and an extension of the iterative nullspace projection technique that can handle multiple protected attributes is evaluated.
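For readers unfamiliar with the underlying technique, the following is a minimal single-attribute sketch of iterative nullspace projection (INLP) in numpy/scikit-learn; it is not the multi-attribute extension evaluated in the paper, and the hyperparameters are arbitrary.

# Minimal INLP sketch: repeatedly train a linear classifier for the protected
# attribute and project the embeddings onto the nullspace of its weights.
# Single-attribute illustration only, not the paper's multi-attribute extension.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projector(w: np.ndarray) -> np.ndarray:
    """Symmetric projector onto the nullspace of the rows of w (shape k x d)."""
    p_row = w.T @ np.linalg.pinv(w @ w.T) @ w
    return np.eye(w.shape[1]) - p_row

def inlp(X: np.ndarray, z: np.ndarray, n_iters: int = 10) -> np.ndarray:
    """Return P such that X @ P makes attribute z (approximately) linearly unpredictable."""
    P = np.eye(X.shape[1])
    X_proj = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        P_i = nullspace_projector(clf.coef_)
        X_proj = X_proj @ P_i          # remove the newly found direction
        P = P @ P_i                    # accumulate the overall projection
    return P

# toy usage:
# X = np.random.randn(500, 50); z = np.random.randint(0, 2, size=500)
# X_debiased = X @ inlp(X, z)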
What do Bias Measures Measure?
TLDR
This work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, social biases, and corresponding harms, and proposes a documentation standard for bias measures to aid their development, categorization, and appropriate usage.
General-Purpose Question-Answering with Macaw
TLDR
MACAW, a versatile generative question-answering (QA) system built on UnifiedQA, exhibits strong zero-shot performance on a wide variety of topics, outperforming GPT-3 by over 10% (absolute) on Challenge300.
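As a usage note (an assumption based on the publicly released checkpoints rather than anything stated above): Macaw is distributed as a T5-style seq2seq model with a slot-based prompt format, so querying it might look roughly like this.

# Rough usage sketch, assuming the allenai/macaw-large checkpoint on the
# Hugging Face Hub and its slot-based prompt format ("$answer$ ; $question$ = ...").
# Consult the official Macaw repository for the exact interface.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("allenai/macaw-large")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/macaw-large")

prompt = "$answer$ ; $question$ = What gas do plants absorb from the air?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
# expected output of the form: "$answer$ = carbon dioxide"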
Analyzing Stereotypes in Generative Text Inference Tasks
TLDR
This work studies how stereotypes manifest when the potential targets of stereotypes are situated in real-life, neutral contexts, collects human judgments on the presence of stereotypes in generated inferences, and compares how perceptions of stereotypes vary with annotator positionality.
Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?
TLDR
This work proposes a new language understanding task, Linguistic Ethical Interventions (LEI), where the goal is to amend a question-answering (QA) model’s unethical behavior by communicating context-specific principles of ethics and equity to it.
Toward Deconfounding the Influence of Entity Demographics for Question Answering Accuracy
TLDR
Model accuracy analysis reveals little evidence that accuracy is lower for people based on gender or nationality; instead, accuracy varies more across professions (question topics).
Team JARS: DialDoc Subtask 1 - Improved Knowledge Identification with Supervised Out-of-Domain Pretraining
In this paper, we discuss our submission for DialDoc subtask 1. The subtask requires systems to extract from FAQ-type documents the knowledge needed to reply to a user’s query in a conversational setting.

References

Showing 1–10 of 36 references
StereoSet: Measuring stereotypical bias in pretrained language models
TLDR
StereoSet is presented, a large-scale natural dataset in English to measure stereotypical biases in four domains: gender, profession, race, and religion, and it is shown that popular models like BERT, GPT-2, RoBERTa, and XLNet exhibit strong stereotypical biases.
Assessing Social and Intersectional Biases in Contextualized Word Representations
TLDR
Evaluating bias effects at the contextual word level captures biases that are not captured at the sentence level, confirming the need for this novel approach.
Quantifying Social Biases in Contextual Word Representations
TLDR
A template-based method to quantify bias in BERT is proposed, and it is shown that this method obtains more consistent results in capturing social biases than the traditional cosine-based method.
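To illustrate (with hypothetical templates and attribute words, not the paper's actual measure), a template-based probe compares a masked language model's probability for the same attribute word when only the target group in the template changes:

# Illustrative template-based probe: compare the masked-LM probability of the
# same attribute word across templates that differ only in the target group.
# Templates, words, and the scoring heuristic are hypothetical examples.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def attribute_prob(target: str, attribute: str, top_k: int = 200) -> float:
    """Probability mass BERT assigns to `attribute` in the masked slot
    (0.0 if it does not appear among the top_k predictions)."""
    template = f"{target} is known to be [MASK]."
    for pred in fill(template, top_k=top_k):
        if pred["token_str"].strip() == attribute:
            return pred["score"]
    return 0.0

for target in ["he", "she"]:
    print(target, attribute_prob(target, "aggressive"))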
OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings
TLDR
OSCaR (Orthogonal Subspace Correction and Rectification), a bias-mitigating method that focuses on disentangling biased associations between concepts instead of removing concepts wholesale, is proposed.
On Measuring and Mitigating Biased Inferences of Word Embeddings
TLDR
A mechanism for measuring stereotypes using the task of natural language inference is designed, and a reduction in invalid inferences is demonstrated via bias-mitigation strategies on static word embeddings (GloVe); for gender bias, these techniques are shown to extend to contextualized embeddings when applied selectively only to their static components.
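A rough illustration of the NLI-based measurement idea (with an off-the-shelf MNLI model and made-up sentence pairs, not the paper's protocol or embeddings): compare entailment probabilities for hypotheses that differ only in the demographic term.

# Illustrative NLI probe: score how strongly a neutral premise is judged to
# entail hypotheses that swap in different demographic terms. Model choice,
# sentences, and the entailment label index (per the roberta-large-mnli model
# card) are assumptions, not the paper's protocol.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

premise = "The person crashed the car."
for group in ["man", "woman"]:
    hypothesis = f"The {group} crashed the car."
    inputs = tok(premise, hypothesis, return_tensors="pt")
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    print(group, "entailment prob:", round(probs[2].item(), 3))  # index 2 = entailment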
Hurtful words: quantifying biases in clinical contextual word embeddings
TLDR
This work pretrains deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset and quantifies potential disparities using two approaches to identify dangerous latent relationships captured by the contextual word embeddings.
Identifying and Reducing Gender Bias in Word-Level Language Models
TLDR
This study proposes a metric to measure gender bias and a regularization loss term for the language model that minimizes the projection of encoder-trained embeddings onto a subspace that encodes gender, and finds this regularization effective in reducing gender bias.
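Sketched very roughly (and not in the paper's exact formulation), such a regularizer can be written as a penalty on the projection of the embedding matrix onto a precomputed gender subspace:

# Rough sketch of a projection-penalty regularizer: penalize the component of
# the embedding weights that lies in a precomputed gender subspace B.
# The subspace B and the weighting lam are assumed inputs; this is not the
# paper's exact loss.
import torch

def gender_projection_penalty(embedding_weight: torch.Tensor,
                              gender_basis: torch.Tensor,
                              lam: float = 1.0) -> torch.Tensor:
    """embedding_weight: (vocab, dim); gender_basis: (k, dim) with orthonormal rows.
    Returns lam times the squared Frobenius norm of the projection onto the subspace."""
    coords = embedding_weight @ gender_basis.T   # (vocab, k)
    return lam * coords.pow(2).sum()

# during training: loss = lm_loss + gender_projection_penalty(model.embedding.weight, B)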
Gender Bias in Contextualized Word Embeddings
TLDR
It is shown that a state-of-the-art coreference system that depends on ELMo inherits its bias and demonstrates significant bias on the WinoBias probing corpus, and two methods to mitigate such gender bias are explored.
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
TLDR
This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
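The core "neutralize" step of this hard-debiasing approach removes each word vector's component along a gender direction; a minimal numpy sketch follows (the direction here is approximated from a single definitional pair rather than the PCA over several pairs used in the paper).

# Minimal sketch of the "neutralize" step of hard debiasing: remove a word
# vector's component along a gender direction. Here the direction is
# approximated from the single pair ("he", "she"); the paper uses PCA over
# several definitional pairs.
import numpy as np

def gender_direction(emb: dict) -> np.ndarray:
    g = emb["he"] - emb["she"]
    return g / np.linalg.norm(g)

def neutralize(word_vec: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Remove the component of word_vec along the (unit) gender direction g."""
    return word_vec - np.dot(word_vec, g) * g

# usage: emb["programmer"] = neutralize(emb["programmer"], gender_direction(emb))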
The Woman Worked as a Babysitter: On Biases in Language Generation
TLDR
The notion of regard towards a demographic is introduced, the varying levels of regard towards different demographics are used as a defining metric for bias in NLG, and the extent to which sentiment scores are a relevant proxy for regard is analyzed.