Gender Bias in Coreference Resolution

@inproceedings{Rudinger2018GenderBI,
  title={Gender Bias in Coreference Resolution},
  author={Rachel Rudinger and Jason Naradowsky and Brian Leonard and Benjamin Van Durme},
  booktitle={NAACL},
  year={2018}
}
We present an empirical study of gender bias in coreference resolution systems. We first introduce a novel, Winograd schema-style set of minimal pair sentences that differ only by pronoun gender. With these “Winogender schemas,” we evaluate and confirm systematic gender bias in three publicly-available coreference resolution systems, and correlate this bias with real-world and textual gender statistics. 
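
The minimal-pair idea in the abstract can be sketched in a few lines of Python. This is only an illustration and not the authors' released code: the template sentence, the placeholder names (`occupation`, `participant`, `nom`, `acc`, `poss`), and the helpers `minimal_pair` and `differing_positions` are all hypothetical. It fills one sentence template with two pronoun sets and checks that the results differ only at the pronoun position.

```python
from string import Template

# Hypothetical pronoun table, used only for illustration.
PRONOUNS = {
    "female": {"nom": "she", "acc": "her", "poss": "her"},
    "male": {"nom": "he", "acc": "him", "poss": "his"},
}


def minimal_pair(template, occupation, participant):
    """Fill one sentence template once per pronoun set.

    Returns a dict mapping the pronoun-set name to the filled sentence.
    Pronoun forms that the template does not use are ignored.
    """
    tmpl = Template(template)
    return {
        gender: tmpl.substitute(
            occupation=occupation, participant=participant, **forms
        )
        for gender, forms in PRONOUNS.items()
    }


def differing_positions(sentence_a, sentence_b):
    """Return the whitespace-token indices at which two sentences differ."""
    tokens_a, tokens_b = sentence_a.split(), sentence_b.split()
    if len(tokens_a) != len(tokens_b):
        raise ValueError("sentences must have the same number of tokens")
    return [i for i, (a, b) in enumerate(zip(tokens_a, tokens_b)) if a != b]


# Made-up example template; not taken from the Winogender data.
EXAMPLE_TEMPLATE = "The $occupation told the $participant that $nom could pay with cash."

pair = minimal_pair(EXAMPLE_TEMPLATE, "cashier", "customer")
for gender, sentence in pair.items():
    print(f"{gender}: {sentence}")
print("differing token positions:", differing_positions(pair["female"], pair["male"]))
```

A coreference system could then be run on both sentences of a pair. Any difference in which noun the pronoun is linked to is attributable to the pronoun alone, since everything else is held fixed.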

Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods

TLDR
A data-augmentation approach is demonstrated that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by rule-based, feature-rich, and neural coreference systems in WinoBias without significantly affecting their performance on existing datasets.

Toward Gender-Inclusive Coreference Resolution

TLDR
Through these studies, conducted on English text, it is confirmed that systems built without acknowledging and recognizing the complexity of gender lead to many potential harms.

Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle

TLDR
It is confirmed that, without acknowledging the complexity of gender and building systems that recognize it, the resulting systems fail on quality of service, stereotyping, and over- or under-representation, especially for binary and non-binary trans users.

Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation

TLDR
Grammatical patterns indicating stereotypical and non-stereotypical gender-role assignments are found in corpora from three domains, resulting in a first large-scale gender bias dataset of 108K diverse real-world English sentences; finetuning a coreference resolution model on this dataset is found to mitigate bias on a held-out set.

Gender Coreference and Bias Evaluation at WMT 2020

TLDR
This work presents the largest evidence for gender coreference and bias in machine translation in more than 19 systems submitted to the WMT over four diverse target languages: Czech, German, Polish, and Russian.

The Hard-CoRe Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution

TLDR
A new benchmark task for coreference resolution, Hard-CoRe, is introduced that targets common-sense reasoning and world knowledge; it is shown empirically that state-of-the-art models often fail to capture context and rely only on the antecedents to make a decision.

Recognition of They/Them as Singular Personal Pronouns in Coreference Resolution

TLDR
A new benchmark for coreference resolution systems, based on WinoNB schemas, is introduced to evaluate recognition of singular personal “they”; it confirms that these systems are biased toward resolving “they” pronouns as plural.

Gender Bias in Contextualized Word Embeddings

TLDR
It is shown that a state-of-the-art coreference system that depends on ELMo inherits its bias and demonstrates significant bias on the WinoBias probing corpus, and two methods to mitigate such gender bias are explored.

Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns

TLDR
GAP, a gender-balanced labeled corpus of 8,908 ambiguous pronoun–name pairs sampled to provide diverse coverage of challenges posed by real-world text, is presented and released, and it is shown that syntactic structure and continuous neural models provide promising, complementary cues for approaching the challenge.

Incorporating Subjectivity into Gendered Ambiguous Pronoun (GAP) Resolution using Style Transfer

TLDR
A new evaluation dataset for gender bias in coreference resolution, GAP-Subjective, is introduced, which increases the coverage of the original GAP dataset by including subjective sentences; the methodology used to create this dataset is outlined.
...

References

Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods

TLDR
A data-augmentation approach is demonstrated that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by rule-based, feature-rich, and neural coreference systems in WinoBias without significantly affecting their performance on existing datasets.

Bootstrapping Path-Based Pronoun Resolution

TLDR
This work learns the likelihood of coreference between a pronoun and a candidate noun based on the path in the parse tree between the two entities, and robustly addresses traditional syntactic coreference constraints.

Stanford’s Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task

TLDR
The coreference resolution system submitted by Stanford at the CoNLL-2011 shared task was ranked first in both tracks, with a score of 57.8 in the closed track and 58.3 in the open track.

Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution

We introduce a simple, non-linear mention-ranking model for coreference resolution that attempts to learn distinct feature representations for anaphoricity detection and antecedent ranking, which we

Social Bias in Elicited Natural Language Inferences

TLDR
The SNLI human-elicitation protocol makes it prone to amplifying bias and stereotypical associations, which is demonstrated statistically and with qualitative examples.

These are not the Stereotypes You are Looking For: Bias and Fairness in Authorial Gender Attribution

TLDR
This work explores the issue of author gender in two datasets of Dutch literary novels using commonly used descriptive and predictive methods, and shows the importance of controlling for variables in the corpus.

Learning Structured Perceptrons for Coreference Resolution with Latent Antecedents and Non-local Features

TLDR
This work investigates different ways of learning structured perceptron models for coreference resolution when using non-local features and beam search and obtains the best results to date on recent shared task data for Arabic, Chinese, and English.

Improving Coreference Resolution by Learning Entity-Level Distributed Representations

TLDR
A neural network based coreference system is introduced that produces high-dimensional vector representations for pairs of coreference clusters, learns when combining clusters is desirable, and substantially outperforms the current state of the art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset.

Easy Victories and Uphill Battles in Coreference Resolution

TLDR
This work presents a state-of-the-art coreference system that captures various syntactic, discourse, and semantic phenomena implicitly, with a small number of homogeneous feature templates examining shallow properties of mentions, allowing it to win “easy victories” without crafted heuristics.

Learning Global Features for Coreference Resolution

TLDR
RNNs are proposed for learning latent, global representations of entity clusters directly from their mentions; these representations are especially useful for the prediction of pronominal mentions and can be incorporated into an end-to-end coreference system that outperforms the state of the art without requiring any additional search.