“You Are Grounded!”: Latent Name Artifacts in Pre-trained Language Models

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models. We focus on artifacts associated with the representation of given names (e.g., Donald), which, depending on the corpus, may be associated with specific entities, as indicated by next token prediction (e.g., Trump). While helpful in some contexts, grounding also happens in under-specified or inappropriate contexts. For example, endings generated for “Donald is a” substantially…

Societal Biases in Language Generation: Progress and Challenges

A survey of societal biases in language generation is presented, focusing on how data and techniques contribute to biases, on progress towards reducing them, and on the effects of decoding techniques.

On Gender Biases in Offensive Language Classification Models

We explore whether neural Natural Language Processing models trained to identify offensive language in tweets contain gender biases. We add historically gendered and gender ambiguous American names

Do Neural Language Models Overcome Reporting Bias?

It is found that while pre-trained language models' generalization capacity allows them to better estimate the plausibility of frequent but rarely stated actions, outcomes, and properties, they also tend to overestimate the plausibility of the very rare, amplifying the bias that already exists in their training corpus.

Multilingual Coreference Resolution in Multiparty Dialogue

A large-scale dataset based on TV transcripts, Multilingual Multiparty Coref (MMC), is created and used successfully both for data augmentation and for training from scratch, which effectively simulates the zero-shot cross-lingual setting.

Recognition of They/Them as Singular Personal Pronouns in Coreference Resolution

A new benchmark for coreference resolution systems, based on WinoNB schemas, is introduced to evaluate recognition of singular personal “they”; it confirms these systems' bias toward resolving “they” pronouns as plural.

Measuring and Mitigating Name Biases in Neural Machine Translation

This paper proposes a simple but effective data augmentation method based on randomly switching entities during translation, which effectively eliminates the problem without any effect on translation quality.

Richer Countries and Richer Representations

We examine whether some countries are more richly represented in embedding space than others. We find that countries whose names occur with low frequency in training corpora are more likely to be

Identifying and Measuring Token-Level Sentiment Bias in Pre-trained Language Models with Prompts

Two token-level sentiment tests are proposed, the Sentiment Association Test (SAT) and the Sentiment Shift Test (SST), which use the prompt as a probe to detect latent bias in PLMs; the results suggest that prompting can possibly augment the existing bias in PLMs.

Does entity abstraction help generative Transformers reason?

The results suggest that the benefit of explicit abstraction is significant in formally defined logical reasoning settings requiring many reasoning hops, but point to the notion that it is less beneficial for NLP tasks having less formal logical structure.

On the Robustness of Reading Comprehension Models to Entity Renaming

A pipeline is presented to automatically generate test examples at scale, by replacing entity names in the original test sample with names from a variety of sources, and it is found that entity-based masking can improve the robustness of MRC models.



Perturbation Sensitivity Analysis to Detect Unintended Model Biases

A generic evaluation framework, Perturbation Sensitivity Analysis, is proposed, which detects unintended model biases related to named entities and requires no new annotations or corpora.

Improving Machine Reading Comprehension with General Reading Strategies

Three general strategies for improving non-extractive machine reading comprehension (MRC) are proposed, and their effectiveness is demonstrated along with the versatility and general applicability of fine-tuned models that incorporate them.

The Curious Case of Neural Text Degeneration

Sampling text from the dynamic nucleus of the probability distribution allows for diversity while effectively truncating the less reliable tail of the distribution; the resulting text better matches the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence.
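
The nucleus idea in the summary above can be illustrated with a minimal sketch. It assumes the next-token distribution is already available as a plain list of probabilities; the function names `nucleus_filter` and `nucleus_sample` and the threshold parameter `top_p` are hypothetical and chosen for illustration, not taken from the paper's code.

```python
import random


def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of highest-probability tokens whose
    cumulative mass reaches top_p, then renormalize over that set."""
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in ranked:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break  # everything after this point is the truncated tail
    return {i: probs[i] / total for i in kept}


def nucleus_sample(probs, top_p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

Because the cutoff depends on cumulative probability rather than a fixed count, the number of candidate tokens grows when the distribution is flat and shrinks when it is peaked, which is what makes the nucleus "dynamic".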

