Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
@article{Sap2021AnnotatorsWA,
  title   = {Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection},
  author  = {Maarten Sap and Swabha Swayamdipta and Laura Vianna and Xuhui Zhou and Yejin Choi and Noah A. Smith},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2111.07997}
}
The perceived toxicity of language can vary based on someone’s identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases. We seek to understand the *who*, *why*, and *what* behind biases in toxicity annotations. In two online studies with demographically and politically diverse participants, we investigate the effect of annotator identities (*who*) and beliefs (*why*), drawing from social psychology research about…
47 Citations
Assessing Annotator Identity Sensitivity via Item Response Theory: A Case Study in a Hate Speech Corpus
- Computer Science · FAccT
- 2022
This work utilizes item response theory (IRT), a methodological approach developed for measurement theory, to quantify annotator identity sensitivity, and uses three different IRT techniques to assess whether an annotator’s racial identity is associated with their ratings on comments that target different racial identities.
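The IRT setup can be pictured compactly: a Rasch-style (1PL) model amounts to a logistic regression over annotation outcomes, and an annotator-identity × comment-target interaction term carries the identity-sensitivity signal. The sketch below is a simplified stand-in for the paper's three IRT techniques; all data, column names, and effect sizes are invented.

```python
# Simplified Rasch-flavoured sensitivity check, not the paper's exact IRT
# pipeline: simulate binary hate ratings and test the identity x target
# interaction with a logistic regression (all data and effects invented).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
annotator_black = rng.integers(0, 2, n)  # annotator racial identity (simulated)
targets_black = rng.integers(0, 2, n)    # whether the comment targets that identity
# Build in a +0.8-logit interaction: identity-matched annotators rate
# targeted comments as hateful more often.
logits = -0.5 + 0.3 * targets_black + 0.8 * annotator_black * targets_black
rated_hateful = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

df = pd.DataFrame({"rated_hateful": rated_hateful,
                   "annotator_black": annotator_black,
                   "targets_black": targets_black})
fit = smf.logit("rated_hateful ~ annotator_black * targets_black", df).fit(disp=0)
# A reliably positive interaction coefficient is the identity-sensitivity signal.
print(fit.params["annotator_black:targets_black"])
```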
Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information
- Computer Science
- 2023
The paper aims to improve the annotation process for more efficient and inclusive NLP systems through a novel disagreement prediction mechanism and shows that knowing annotators’ demographic information, like gender, ethnicity, and education level, helps predict disagreements.
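As a rough illustration of that disagreement-prediction idea (not the paper's actual mechanism or features), the sketch below trains a classifier to predict whether an annotator will disagree with the majority label from demographic attributes alone; the data, columns, and effect sizes are all invented.

```python
# Hypothetical sketch: predict annotator disagreement from demographics.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 5000
gender = rng.choice(["woman", "man", "nonbinary"], n)
education = rng.choice(["high_school", "college", "graduate"], n)
# Simulate disagreement that correlates mildly with one demographic attribute.
p_disagree = 0.20 + 0.10 * (education == "graduate")
disagrees = (rng.random(n) < p_disagree).astype(int)

X = OneHotEncoder().fit_transform(np.column_stack([gender, education]))
X_tr, X_te, y_tr, y_te = train_test_split(X, disagrees, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
# Compare against the all-"agrees" baseline rate (~0.77 here).
print("held-out accuracy:", clf.score(X_te, y_te))
```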
Noise Audits Improve Moral Foundation Classification
- Computer Science · ArXiv
- 2022
Two metrics to audit the noise of annotations are proposed and experiments show that removing noisy annotations based on the proposed metrics improves classification performance.
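A generic version of such a noise audit (the paper's two metrics are not reproduced here) scores each annotator by agreement with the per-item majority vote and drops annotations from low-agreement annotators:

```python
# Generic noise-audit sketch with invented (item, annotator, label) triples.
import pandas as pd

ann = pd.DataFrame({
    "item": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "annotator": ["a", "b", "c"] * 3,
    "label": [1, 1, 0, 0, 0, 1, 1, 1, 1],
})
# Majority label per item, then each annotator's agreement rate with it.
majority = ann.groupby("item")["label"].agg(lambda s: s.mode().iloc[0])
ann["agrees"] = ann["label"] == ann["item"].map(majority)
agreement = ann.groupby("annotator")["agrees"].mean()

THRESHOLD = 0.5  # illustrative cutoff, not from the paper
clean = ann[ann["annotator"].map(agreement) > THRESHOLD]
print(agreement)
print(len(ann) - len(clean), "annotations removed")
```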
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
- Computer Science · ArXiv
- 2022
This paper proposes a novel DIALBIAS FRAME for analyzing social bias in conversations pragmatically, going beyond simple dichotomous annotations to more comprehensive bias-related analyses, and introduces the CDAIL-BIAS DATASET, the first well-annotated Chinese social bias dialog dataset.
Impact of Annotator Demographics on Sentiment Dataset Labeling
- Computer Science · Proceedings of the ACM on Human-Computer Interaction
- 2022
It is shown that demographic differences among annotators have a significant effect on their ratings, and that these effects also appear in each component modality of multimodal sentiment data.
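A minimal version of that kind of effect check, with invented ratings: a one-way ANOVA asks whether mean sentiment ratings differ across annotator demographic groups.

```python
# Invented 1-7 sentiment ratings from three hypothetical annotator age
# brackets, simulated with slightly different means.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
ratings_18_29 = rng.normal(4.2, 1.0, 300)
ratings_30_49 = rng.normal(4.0, 1.0, 300)
ratings_50_up = rng.normal(3.7, 1.0, 300)

stat, p = f_oneway(ratings_18_29, ratings_30_49, ratings_50_up)
print(f"F={stat:.2f}, p={p:.4f}")  # a small p suggests a demographic effect
```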
Estimating Ground Truth in a Low-labelled Data Regime: A Study of Racism Detection in Spanish
- Computer Science · ICWSM Workshops
- 2022
This study analyses a new dataset for detecting racism in Spanish, focusing on estimating ground truth from few labels with high disagreement, and shows better performance at lower thresholds for classifying messages as racist.
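The threshold comparison can be pictured with a small sketch: estimate a soft ground truth as the fraction of "racist" votes per message, then sweep the decision threshold. All votes below are invented.

```python
# Thresholded ground-truth estimation over hypothetical annotator votes.
import numpy as np

votes = np.array([        # rows = messages, columns = 3 annotators
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
])
racist_fraction = votes.mean(axis=1)  # soft label per message
for threshold in (0.66, 0.5, 0.33):
    labels = racist_fraction >= threshold
    print(f"threshold={threshold}: {labels.sum()} messages labelled racist")
```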
The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism
- Computer Science · NLPERSPECTIVES
- 2022
The Measuring Hate Speech corpus, a dataset created to measure hate speech while adjusting for annotators’ perspectives, is introduced, facilitating analyses of interactions between annotator-level and comment-level identities, i.e., identity-related annotator perspectives.
Addressing religious hate online: from taxonomy creation to automated detection
- Computer Science · PeerJ Computer Science
- 2022
A fine-grained labeling scheme for religious hate speech detection is proposed; it builds on a broader, highly interoperable taxonomy of abusive language and covers the three main monotheistic religions: Judaism, Christianity, and Islam.
Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models
- Computer Science · ArXiv
- 2022
This work proposes tracking annotator heuristic traces, suggesting that monitoring heuristic usage among annotators can help with collecting challenging datasets and diagnosing model biases.
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
- Computer Science · ArXiv
- 2022
It is argued that more care is needed to construct training corpora for language models with better transparency and justification for the inclusion or exclusion of various texts, and that privileging any corpus as high quality entails a language ideology.
References
Showing 1-10 of 114 references.
Hatred is in the Eye of the Annotator: Hate Speech Classifiers Learn Human-Like Social Stereotypes
- Psychology · CogSci
- 2020
The results demonstrate that hate speech classifiers learn human-like biases which can further perpetuate social inequalities when propagated at scale, and provide insights into additional sources of bias in hate speech moderation, informing ongoing debates regarding fairness in machine learning.
Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection
- Computer Science · ArXiv
- 2021
An in-depth study models polarized opinions coming from different communities, under the hypothesis that shared characteristics can influence annotators’ perspectives on a given phenomenon, and shows how this approach improves the prediction performance of a state-of-the-art supervised classifier.
The Risk of Racial Bias in Hate Speech Detection
- Computer Science · ACL
- 2019
This work proposes *dialect* and *race priming* as ways to reduce the racial bias in annotation, showing that when annotators are made explicitly aware of an AAE tweet’s dialect they are significantly less likely to label the tweet as offensive.
Identifying and Measuring Annotator Bias Based on Annotators’ Demographic Characteristics
- Computer Science · ALW
- 2020
This work investigates annotator bias using classification models trained on data from demographically distinct annotator groups, and shows that demographic features, such as first language, age, and education, correlate with significant performance differences.
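The group-comparison setup can be sketched as follows: train a classifier on one demographic group's labels and score it against another group's labels on the same texts; a gap suggests group-dependent annotation. Everything here (texts, labels, group definitions) is a toy stand-in for the paper's data.

```python
# Toy sketch of training on one annotator group's labels and evaluating
# against another group's labels; not the paper's models or data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["you are awful", "have a nice day", "totally useless person",
         "great work", "nobody likes you", "see you tomorrow"]
labels_group_a = [1, 0, 1, 0, 1, 0]  # e.g., first-language-English annotators
labels_group_b = [1, 0, 0, 0, 1, 0]  # a demographically distinct group

X = TfidfVectorizer().fit_transform(texts)
clf = LogisticRegression().fit(X, labels_group_a)
# A large gap between the two scores signals group-dependent annotation.
print("agreement with group A labels:", clf.score(X, labels_group_a))
print("agreement with group B labels:", clf.score(X, labels_group_b))
```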
Social Bias Frames: Reasoning about Social and Power Implications of Language
- Psychology · ACL
- 2020
It is found that while state-of-the-art neural models are effective at high-level categorization of whether a given statement projects unwanted social bias, they are not effective at spelling out more detailed explanations in terms of Social Bias Frames.
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
- Computer Science · EMNLP
- 2019
It is shown that model performance improves when training with annotator identifiers as features, that models are able to recognize the most productive annotators, and that models often do not generalize well to examples from annotators who did not contribute to the training set.
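A toy version of the annotator-identifier experiment: append a one-hot annotator ID to the text features and compare cross-validated accuracy with and without it. The data below is invented, with one simulated annotator quirk so the ID feature carries signal.

```python
# Invented data for the annotator-ID feature comparison; not the paper's setup.
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
texts = ["not sure this holds", "clearly entailed", "contradiction here",
         "maybe neutral", "this follows", "cannot be inferred"] * 10
annotator = rng.choice(["a1", "a2", "a3"], size=len(texts))
base_labels = np.array([0, 1, 2, 0, 1, 2] * 10)
labels = base_labels.copy()
labels[(annotator == "a3") & (base_labels == 2)] = 0  # simulated annotator quirk

X_text = TfidfVectorizer().fit_transform(texts)
X_id = OneHotEncoder().fit_transform(annotator.reshape(-1, 1))

for name, X in [("text only", X_text),
                ("text + annotator ID", hstack([X_text, X_id]))]:
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```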
Political psycholinguistics: A comprehensive analysis of the language habits of liberal and conservative social media users.
- Psychology · Journal of Personality and Social Psychology
- 2020
For nearly a century social scientists have sought to understand left-right ideological differences in values, motives, and thinking styles. Much progress has been made, but, as in other areas of…
Language (Technology) is Power: A Critical Survey of “Bias” in NLP
- Psychology · ACL
- 2020
A greater recognition of the relationships between language and social hierarchies is urged, encouraging researchers and practitioners to articulate their conceptualizations of “bias” and to center work around the lived experiences of members of communities affected by NLP systems.
Ground-Truth, Whose Truth? - Examining the Challenges with Annotating Toxic Text Datasets
- Computer Science · ArXiv
- 2021
Samples from three toxic text datasets are re-annotated, finding that a multi-label approach to annotating toxic text can help improve dataset quality and capture context dependence and annotator diversity.
Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter
- Computer Science · NLP+CSS@EMNLP
- 2016
It is found that amateur annotators are more likely than expert annotators to label items as hate speech, and that systems trained on expert annotations outperform systems trained on amateur annotations.
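Rate differences like that are the kind of thing a simple two-proportion test makes concrete; the counts below are invented purely for illustration.

```python
# Hypothetical counts for the amateur-vs-expert comparison: test whether
# amateurs label items as hate speech at a higher rate than experts.
from statsmodels.stats.proportion import proportions_ztest

hate_labels = [320, 240]    # items labelled hate speech by amateurs, experts
items_rated = [1000, 1000]  # items each group rated (invented numbers)
stat, p = proportions_ztest(hate_labels, items_rated, alternative="larger")
print(f"z = {stat:.2f}, p = {p:.4f}")  # small p: amateurs label more as hate
```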