Like Trainer, Like Bot? Inheritance of Bias in Algorithmic Content Moderation

@inproceedings{Binns2017LikeTL,
  title={Like Trainer, Like Bot? Inheritance of Bias in Algorithmic Content Moderation},
  author={Reuben Binns and Michael Veale and Max Van Kleek and Nigel Shadbolt},
  booktitle={SocInfo},
  year={2017}
}
The internet has become a central medium through which 'networked publics' express their opinions and engage in debate. Offensive comments and personal attacks can inhibit participation in these spaces. Automated content moderation aims to overcome this problem using machine learning classifiers trained on large corpora of texts manually annotated for offence. While such systems could help encourage more civil debate, they must navigate inherently normatively contestable boundaries, and are…
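The abstract does not name a specific model; as a minimal sketch of the kind of pipeline it describes (a classifier trained on comments manually annotated for offence), the following uses TF-IDF features with logistic regression on toy data. All data and parameter choices here are illustrative assumptions.

```python
# Minimal sketch (toy data): a text classifier trained on comments annotated
# for offence, in the spirit of the systems the abstract describes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative training corpus: comments with labels, where 1 = offensive.
comments = [
    "thanks for the thoughtful reply",
    "you are an idiot and should shut up",
    "I disagree, but I see your point",
    "nobody cares about your stupid opinion",
]
labels = [0, 1, 0, 1]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(comments, labels)

print(model.predict(["you are so stupid"]))     # expected: [1]
print(model.predict(["thanks for the reply"]))  # expected: [0]
```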
Citations

Algorithmic content moderation: Technical and political challenges in the automation of platform governance
As government pressure on major technology companies builds, both firms and legislators are searching for technical solutions to difficult platform governance puzzles such as hate speech and…
Investigating Annotator Bias with a Graph-Based Approach
This study investigates annotator bias, a form of bias that arises from annotators' differing task knowledge and subjective perception; it builds a graph from the annotations of the different annotators and applies a community detection algorithm to group them.
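The summary above does not specify how the annotator graph is weighted or which community detection algorithm is used; a possible sketch of the general idea, assuming edge weights based on pairwise label agreement and greedy modularity communities from networkx:

```python
# Hypothetical sketch: build a graph over annotators, weighting edges by how
# often two annotators give the same label, then group them with community
# detection. The weighting scheme and algorithm are assumptions, not the paper's.
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy annotations: annotator -> {item_id: label}
annotations = {
    "a1": {"t1": 1, "t2": 0, "t3": 1},
    "a2": {"t1": 1, "t2": 0, "t3": 1},
    "a3": {"t1": 0, "t2": 1, "t3": 0},
    "a4": {"t1": 0, "t2": 1, "t3": 1},
}

G = nx.Graph()
for u, v in combinations(annotations, 2):
    shared = set(annotations[u]) & set(annotations[v])
    agreement = sum(annotations[u][i] == annotations[v][i] for i in shared) / len(shared)
    if agreement > 0:                      # connect annotators who ever agree
        G.add_edge(u, v, weight=agreement)

communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])    # groups of similarly-behaving annotators
```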
Systems for collective human curation of online discussion
  • Amy X. Zhang • Sociology • 2019
The internet was supposed to democratize discussion, allowing people from all walks of life to communicate with each other at scale. However, this vision has not been fully realized; instead, online…
Towards Equal Gender Representation in the Annotations of Toxic Language Detection
This paper finds that the BERT model associates toxic comments containing offensive words with male annotators, causing it to predict 67.7% of toxic comments as having been annotated by men, and shows that this disparity between gender predictions can be mitigated by removing offensive words and highly toxic comments from the training data.
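The mitigation described (removing offensive words and highly toxic comments from the training data) might look roughly like the sketch below; the word list, toxicity threshold, and record layout are placeholders, not the paper's actual settings.

```python
# Illustrative sketch of the mitigation described above: drop highly toxic
# comments and mask offensive words before training. Lexicon, threshold and
# record layout are placeholders, not taken from the paper.
import re

OFFENSIVE_WORDS = {"idiot", "stupid"}   # placeholder lexicon
TOXICITY_THRESHOLD = 0.8                # placeholder cut-off

def clean_training_set(rows):
    """rows: iterable of (comment, toxicity_score, annotator_gender)."""
    cleaned = []
    for comment, toxicity, gender in rows:
        if toxicity >= TOXICITY_THRESHOLD:
            continue                     # remove highly toxic comments entirely
        for word in OFFENSIVE_WORDS:
            comment = re.sub(rf"\b{word}\b", "[MASK]", comment, flags=re.IGNORECASE)
        cleaned.append((comment, gender))
    return cleaned

print(clean_training_set([
    ("you absolute idiot", 0.9, "m"),
    ("that was a stupid decision", 0.5, "f"),
    ("thanks for sharing", 0.1, "f"),
]))
```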
TAR on Social Media: A Framework for Online Content Moderation
Content moderation (removing or limiting the distribution of posts based on their contents) is one tool social networks use to fight problems such as harassment and disinformation. Manually screening…
Identifying and Measuring Annotator Bias Based on Annotators’ Demographic Characteristics
This work investigates annotator bias using classification models trained on data from demographically distinct annotator groups, and shows that demographic features, such as first language, age, and education, correlate with significant performance differences.
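One way to read the setup above is to train one model per annotator group and evaluate each against the other group's labels; the sketch below assumes toy data, two hypothetical groups, and a TF-IDF plus logistic regression model, none of which come from the paper.

```python
# Sketch of the setup above: train a classifier on labels from one annotator
# group and evaluate it against another group's labels. Groups, data, and the
# model choice are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train(texts, labels):
    return make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

# Toy data: the same comments labelled by two demographically distinct groups.
texts = ["you fool", "nice point", "go away loser",
         "well argued", "total nonsense", "fair enough"]
labels_group_a = [1, 0, 1, 0, 1, 0]   # e.g. annotators with English as a first language
labels_group_b = [1, 0, 1, 0, 0, 0]   # e.g. annotators with another first language

model_a = train(texts, labels_group_a)
model_b = train(texts, labels_group_b)

# Cross-group accuracy gaps hint at annotator-group-specific bias (toy check,
# evaluated on the training texts only for brevity).
print("model A vs group B labels:", model_a.score(texts, labels_group_b))
print("model B vs group A labels:", model_b.score(texts, labels_group_a))
```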
Feature-Based Explanations Don't Help People Detect Misclassifications of Online Toxicity
It is found that, contrary to expectations, explanations have no significant impact on accuracy or agreement with model predictions, though they do change the distribution of subject error somewhat while reducing the cognitive burden of the task for subjects.
Algorithmic Censorship by Social Platforms: Power and Resistance
This analysis shows that algorithmic censorship is distinctive for two reasons: in potentially bringing all communications carried out on social platforms within reach and in potentially allowing those platforms to take a more active, interventionist approach to moderating those communications.
Offensive, aggressive, and hate speech analysis: From data-centric to human-centered approach
This work proposes two novel perspectives on perception, group-based and individual, which outperform classic data-centric methods that generalize offensiveness perception, and develops requirements for annotation procedures, personalization, and content processing to make such solutions human-centered.
Human-Machine Collaboration for Content Regulation
There is a need for audit tools that help tune the performance of automated mechanisms, a repository for sharing tools, and an improved division of labor between human and machine decision making.

References

Showing 10 of 38 references.
Ex Machina: Personal Attacks Seen at Scale
A method that combines crowdsourcing and machine learning to analyze personal attacks at scale is developed and illustrated, and an evaluation method is shown that measures a classifier in terms of the aggregate number of crowd-workers it can approximate.
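A rough sketch of that evaluation idea, assuming correlation with the crowd's aggregate score as the agreement metric and synthetic worker votes in place of real crowd data:

```python
# Rough sketch: estimate how many crowd-workers a classifier "is worth" by
# comparing its agreement with the crowd's aggregate score against the
# agreement of k-worker sub-groups. Synthetic data; Pearson correlation as the
# agreement metric is an assumption, not the paper's exact measure.
import numpy as np

rng = np.random.default_rng(0)
n_items, n_workers = 200, 10
attack_prob = rng.random(n_items)                               # latent per-comment score
worker_votes = rng.random((n_workers, n_items)) < attack_prob   # noisy binary judgements
aggregate = worker_votes.mean(axis=0)                           # crowd consensus score

clf_scores = attack_prob + rng.normal(0.0, 0.15, n_items)       # stand-in classifier output

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

print("classifier vs aggregate:", round(corr(clf_scores, aggregate), 3))
for k in (1, 3, 5):
    sub = worker_votes[:k].mean(axis=0)     # aggregate of k workers
    rest = worker_votes[k:].mean(axis=0)    # held-out crowd consensus
    print(f"{k} worker(s) vs rest: ", round(corr(sub, rest), 3))
```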
The Bag of Communities: Identifying Abusive Behavior Online with Preexisting Internet Data
It is argued that the BoC approach may allow communities to deal with a range of common problems, like abusive behavior, faster and with fewer engineering resources.
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.
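A simplified sketch of the neutralize step in this kind of debiasing: estimate a bias direction from definitional word pairs and project it out of gender-neutral word vectors. The full method additionally estimates the direction with PCA and equalizes pairs; the random vectors below are stand-ins for trained embeddings.

```python
# Simplified sketch of the neutralize step: estimate a gender direction from
# definitional pairs and remove that component from gender-neutral word
# vectors. Random vectors stand in for trained embeddings.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in
       ["he", "she", "man", "woman", "programmer", "homemaker"]}

# Bias direction: average difference over definitional pairs, normalised.
pairs = [("he", "she"), ("man", "woman")]
g = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)
g /= np.linalg.norm(g)

def neutralize(v, direction):
    """Remove the component of v along the bias direction."""
    return v - v.dot(direction) * direction

for w in ["programmer", "homemaker"]:        # gender-neutral occupation words
    emb[w] = neutralize(emb[w], g)
    print(w, "component along g:", round(float(emb[w].dot(g)), 6))  # ~0.0
```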
CommentIQ: Enhancing Journalistic Curation of Online News Comments
This talk presents an editorially aware visual analytics system called CommentIQ that supports moderators in curating high-quality news comments at scale; the system is discussed in terms of journalistic ideals and norms of free speech and inclusion.
Semantics derived automatically from language corpora contain human-like biases
It is shown that machines can learn word associations from written texts and that these associations mirror those learned by humans, as measured by the Implicit Association Test (IAT), and that applying machine learning to ordinary human language results in human-like semantic biases.
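The association measure can be sketched as a WEAT-style test statistic over cosine similarities; the word sets and the random stand-in vectors below are illustrative only.

```python
# Sketch of a WEAT-style association score: how much more strongly target
# words associate with one attribute set than the other, via cosine
# similarity. Word sets are illustrative and vectors are random stand-ins.
import numpy as np

rng = np.random.default_rng(1)
vocab = ["flower", "rose", "insect", "spider", "pleasant", "love", "unpleasant", "hate"]
emb = {w: rng.normal(size=50) for w in vocab}

def cos(a, b):
    return float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def assoc(word, attrs_a, attrs_b):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    return (np.mean([cos(emb[word], emb[a]) for a in attrs_a])
            - np.mean([cos(emb[word], emb[b]) for b in attrs_b]))

targets_x, targets_y = ["flower", "rose"], ["insect", "spider"]
attrs_a, attrs_b = ["pleasant", "love"], ["unpleasant", "hate"]

# Positive statistic = targets_x lean towards the "pleasant" attributes more
# than targets_y do, mirroring an IAT-style differential association.
stat = (sum(assoc(w, attrs_a, attrs_b) for w in targets_x)
        - sum(assoc(w, attrs_a, attrs_b) for w in targets_y))
print("WEAT-style test statistic:", round(stat, 3))
```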
Us and them: identifying cyber hate on Twitter across multiple protected characteristics
This work uses text parsing to extract typed dependencies, which represent syntactic and grammatical relationships between words and are shown to capture 'othering' language, consistently improving machine classification for different types of cyber hate beyond the use of a bag of words and known hateful terms.
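Typed dependencies of the kind described can be extracted with an off-the-shelf parser; the sketch below uses spaCy, which is an assumption rather than the paper's toolchain.

```python
# Sketch of extracting typed dependencies as classification features. spaCy is
# used as a stand-in parser (requires `python -m spacy download en_core_web_sm`).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Send them all back to where they came from")

# (head, relation, dependent) triples capture grammatical relationships that a
# plain bag of words misses, e.g. who is doing what to whom.
triples = [(tok.head.text, tok.dep_, tok.text) for tok in doc]
print(triples)
```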
Social media as a catalyst for online deliberation? Exploring the affordances of Facebook and YouTube for political expression
It is predicted that political discussions on Facebook will present a more egalitarian distribution of comments between discussants and a higher level of politeness in their messages, whereas politeness is lower on the more anonymous and deindividuated YouTube.
What is a flag for? Social media reporting tools and the vocabulary of complaint
The working of the flag is unpacked, alternatives that give greater emphasis to public deliberation are considered, and the implications of this now commonplace yet rarely studied sociotechnical mechanism for online public discourse are examined.
The politics of ‘platforms’
Online content providers such as YouTube are carefully positioning themselves to users, clients, advertisers and policymakers, making strategic claims for what they do and do not do, and how their…
Certifying and Removing Disparate Impact
This work links disparate impact to a measure of classification accuracy that, while known, has received relatively little attention, and proposes a test for disparate impact based on how well the protected class can be predicted from the other attributes.
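That test can be sketched as follows: fit a classifier that predicts the protected class from the remaining attributes and check its balanced accuracy; the synthetic data and the informal reading of the score below are illustrative.

```python
# Sketch of the disparate-impact test described above: if the protected class
# can be predicted well from the other attributes, those attributes permit
# disparate impact even if the protected attribute itself is never used.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=500)            # protected class membership
X = np.column_stack([
    protected + rng.normal(0.0, 0.5, 500),          # proxy feature (correlated)
    rng.normal(0.0, 1.0, 500),                      # unrelated feature
])

X_tr, X_te, y_tr, y_te = train_test_split(X, protected, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
ba = balanced_accuracy_score(y_te, clf.predict(X_te))

# Balanced accuracy well above 0.5 means the remaining attributes leak the
# protected class, the condition the certification test is built around.
print("balanced accuracy predicting protected class:", round(ba, 3))
```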