Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?
@article{Choenni2021StepmothersAM,
  title={Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?},
  author={Rochelle Choenni and Ekaterina Shutova and Robert van Rooij},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.10052}
}
In this paper, we investigate what types of stereotypical information are captured by pretrained language models. We present the first dataset comprising stereotypical attributes of a range of social groups and propose a method to elicit stereotypes encoded by pretrained language models in an unsupervised fashion. Moreover, we link the emergent stereotypes to their manifestation as basic emotions as a means to study their emotional effects in a more generalized manner. To demonstrate how our…
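The full probing setup is described in the paper itself; as a minimal sketch of the general idea, assuming a search-engine-style prompt and the Hugging Face fill-mask pipeline (the prompt wording and model choice are illustrative, not the authors' exact templates), one can ask a masked language model which attributes it associates with a social group:

```python
# Illustrative sketch (not the paper's exact method): query a masked LM for
# attributes it associates with a social group via a search-engine-style
# prompt. Prompt wording and model choice are assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

prompt = "why are stepmothers so [MASK]?"
for pred in fill_mask(prompt, top_k=5):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```

The top-ranked completions can then be read as candidate stereotypical attributes and, as the abstract describes, linked to basic emotion categories for further analysis.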
3 Citations
StereoKG: Data-Driven Knowledge Graph Construction For Cultural Knowledge and Stereotypes
- Computer Science, WOAH
- 2022
This study presents a fully data-driven pipeline for generating a knowledge graph of cultural knowledge and stereotypes, and shows that intermediate masked language model training on the verbalized KG gives the model a higher level of cultural awareness and can increase classification performance on knowledge-crucial samples in a related task, hate speech detection.
Pipelines for Social Bias Testing of Large Language Models
- Computer Science, BIGSCIENCE
- 2022
This short paper suggests how verification techniques can be used in development pipelines, taking inspiration from software testing and framing social bias evaluation as a form of software testing.
The Birth of Bias: A case study on the evolution of gender bias in an English language model
- Computer Science, GEBNLP
- 2022
It is found that the representation of gender is dynamic, with distinct phases identified during training; gender information is shown to be represented increasingly locally in the model's input embeddings, and debiasing these embeddings can be effective in reducing downstream bias.
References
Showing 1-10 of 54 references
StereoSet: Measuring stereotypical bias in pretrained language models
- Computer Science, ACL
- 2021
StereoSet, a large-scale natural English dataset for measuring stereotypical biases in four domains (gender, profession, race, and religion), is presented, and it is shown that popular models such as BERT, GPT-2, RoBERTa, and XLNet exhibit strong stereotypical biases.
Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models
- Computer Science, EACL
- 2021
This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models when tackling the WinoBias pronoun resolution task.
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
- Psychology, EMNLP
- 2020
CrowS-Pairs is a benchmark for measuring some forms of social bias in language models against protected demographic groups in the US; all three of the widely used MLMs the authors evaluate are found to substantially favor sentences that express stereotypes in every category.
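As a rough illustration of the underlying comparison, a minimal pseudo-log-likelihood scorer for a masked LM might look as follows; this is simplified (the actual benchmark masks and scores only the tokens shared by the two sentences in a pair), and the sentence pair is purely illustrative:

```python
# Simplified pseudo-log-likelihood scorer for an MLM, in the spirit of
# comparing a stereotyping sentence against its minimally different
# counterpart. The real benchmark scores only the tokens shared by the pair.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        with torch.no_grad():
            logits = mlm(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

pair = ("The poor are really ignorant about how to handle money.",
        "The rich are really ignorant about how to handle money.")
print([round(pseudo_log_likelihood(s), 2) for s in pair])
```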
Measuring Bias in Contextualized Word Representations
- Computer Science, Proceedings of the First Workshop on Gender Bias in Natural Language Processing
- 2019
A template-based method to quantify bias in BERT is proposed, and it is shown that this method obtains more consistent results in capturing social biases than the traditional cosine-based method.
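A heavily simplified sketch of a template-based probe in this spirit, assuming a BERT-style mask token and omitting the prior-normalization step the paper uses:

```python
# Simplified template-based probe: compare the MLM's probability of gendered
# pronouns in an occupation template. The paper's measure also normalizes by
# a prior probability; this sketch omits that step.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

template = f"{tok.mask_token} is a programmer."
ids = tok(template, return_tensors="pt")["input_ids"]
mask_pos = (ids[0] == tok.mask_token_id).nonzero().item()
with torch.no_grad():
    probs = torch.softmax(mlm(ids).logits[0, mask_pos], dim=-1)
for word in ("he", "she"):
    wid = tok.convert_tokens_to_ids(word)
    print(word, round(probs[wid].item(), 4))
```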
On Measuring Social Biases in Sentence Encoders
- Computer Science, NAACL
- 2019
The Word Embedding Association Test is extended to measure bias in sentence encoders, with mixed results, including suspicious patterns of sensitivity that suggest the test's assumptions may not hold in general.
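The WEAT/SEAT effect size itself is straightforward to compute once embeddings are available; a minimal sketch, with random vectors standing in for encoder outputs:

```python
# WEAT/SEAT-style effect size over (sentence) embeddings. X, Y are two target
# sets and A, B two attribute sets; random vectors stand in for encoder
# outputs here, purely to show the computation.
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def effect_size(X, Y, A, B):
    s = [assoc(w, A, B) for w in np.vstack([X, Y])]
    return (np.mean([assoc(x, A, B) for x in X])
            - np.mean([assoc(y, A, B) for y in Y])) / np.std(s, ddof=1)

rng = np.random.default_rng(0)
X, Y, A, B = (rng.normal(size=(8, 300)) for _ in range(4))
print(round(effect_size(X, Y, A, B), 3))
```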
Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia
- Sociology, ICWSM
- 2021
The results show systematic differences in how the LGBT community is portrayed across languages, surfacing cultural differences in narratives and signs of social biases.
Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
- Computer Science, NAACL
- 2018
A data-augmentation approach is demonstrated that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by rule-based, feature-rich, and neural coreference systems in WinoBias without significantly affecting their performance on existing datasets.
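A toy sketch of the gender-swapping idea behind this kind of data augmentation; the word list is a tiny illustrative subset, not the paper's full rule set, and real implementations handle ambiguous words such as "her" more carefully:

```python
# Toy gender-swapping augmentation: duplicate each training sentence with
# gendered words exchanged. SWAPS is an illustrative subset only.
SWAPS = {"he": "she", "she": "he", "his": "her", "him": "her",
         "her": "his", "man": "woman", "woman": "man"}

def gender_swap(sentence: str) -> str:
    return " ".join(SWAPS.get(w.lower(), w) for w in sentence.split())

print(gender_swap("The physician hired the secretary because he was overwhelmed"))
```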
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
- Computer Science, NIPS
- 2016
This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
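The core "neutralize" step of this style of debiasing projects out an estimated gender direction; a minimal sketch with placeholder vectors (in practice the direction is estimated from definitional pairs such as he/she):

```python
# Neutralize step: remove the component of a word vector that lies along an
# estimated gender direction. Vectors here are random placeholders.
import numpy as np

def neutralize(v, gender_dir):
    g = gender_dir / np.linalg.norm(gender_dir)
    return v - (v @ g) * g          # remove projection onto the gender direction

rng = np.random.default_rng(1)
he, she, programmer = rng.normal(size=(3, 300))
gender_dir = he - she
debiased = neutralize(programmer, gender_dir)
# component along the gender direction is now ~0
print(round(float(debiased @ (gender_dir / np.linalg.norm(gender_dir))), 6))
```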
The Risk of Racial Bias in Hate Speech Detection
- Computer Science, ACL
- 2019
This work proposes *dialect* and *race priming* as ways to reduce the racial bias in annotation, showing that when annotators are made explicitly aware of an AAE tweet’s dialect they are significantly less likely to label the tweet as offensive.