Do Neural Language Models Overcome Reporting Bias?

Vered Shwartz and Yejin Choi
Mining commonsense knowledge from corpora suffers from reporting bias, over-representing the rare at the expense of the trivial (Gordon and Van Durme, 2013). We study to what extent pre-trained language models overcome this issue. We find that while their generalization capacity allows them to better estimate the plausibility of frequent but unspoken of actions, outcomes, and properties, they also tend to overestimate that of the very rare, amplifying the bias that already exists in their… 


Mitigating Reporting Bias in Semi-supervised Temporal Commonsense Inference with Probabilistic Soft Logic

A novel neural-logic based Soft Logic Enhanced Event Temporal Reasoning (SLEER) model is proposed for acquiring unbiased TCS knowledge, in which the complementary relationships among dimensions are explicitly represented as logic rules and modeled by t-norm fuzzy logics.

A Systematic Investigation of Commonsense Understanding in Large Language Models

It is found that the impressive zero-shot performance of large language models is mostly due to the existence of dataset bias in the authors' benchmarks, and that leveraging explicit commonsense knowledge does not yield substantial improvement.

Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences

Evaluating how well English and Spanish semantic spaces capture a particular type of relational knowledge, namely the traits associated with concepts, and exploring the role of co-occurrences in this context.

Do Language Models Learn Commonsense Knowledge?

Language models (LMs) trained on large amounts of data (e.g., Brown et al., 2020; Patwary et al., 2021) have shown impressive performance on many NLP tasks under the zero-shot and few-shot setup.

Sentence Selection Strategies for Distilling Word Embeddings from BERT

This paper analyzes a range of strategies for selecting the most informative sentences and shows that with a careful selection strategy, high-quality word vectors can be learned from as few as 5 to 10 sentences.

Measuring and Improving Consistency in Pretrained Language Models

The creation of PARAREL, a high-quality resource of cloze-style query English paraphrases, and analysis of the representational spaces of PLMs suggest that they have a poor structure and are currently not suitable for representing knowledge in a robust way.

Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey

A survey of commonsense knowledge acquisition and reasoning tasks and of the strengths and weaknesses of state-of-the-art pre-trained models for commonsense reasoning and generation as revealed by these tasks, with reflections on future research directions.

ALL Dolphins Are Intelligent and SOME Are Friendly: Probing BERT for Nouns’ Semantic Properties and their Prototypicality

This study probes BERT for the properties of English nouns as expressed by adjectives that do not restrict the reference scope of the noun they modify, but instead emphasise some inherent aspect (“red strawberry”) and shows that the model has marginal knowledge of these features and their prevalence as expressed in datasets.

Modeling Event Plausibility with Consistent Conceptual Abstraction

This work shows that Transformer-based plausibility models are markedly inconsistent across the conceptual classes of a lexical hierarchy, and presents a simple post-hoc method of forcing model consistency that improves correlation with human plausibility judgements.

Smoothing Entailment Graphs with Language Models

This paper introduces a new method of graph smoothing, using a Language Model to find the nearest approximations of missing predicates, and formalizes a theory for smoothing a symbolic inference method by constructing transitive chains to smooth both the premise and hypothesis.



Reporting bias and knowledge acquisition

This paper questions the idea that the frequency with which people write about actions, outcomes, or properties is a reflection of real-world frequencies or the degree to which a property is characteristic of a class of individuals.

Commonsense Knowledge Mining from Pretrained Models

This work develops a method for generating commonsense knowledge using a large, pre-trained bidirectional language model that can be used to rank a triple’s validity by the estimated pointwise mutual information between the two entities.

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data

It is argued that a system trained only on form has a priori no way to learn meaning, and a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

A suite of diagnostics drawn from human language experiments is introduced, which allows targeted questions to be asked about the information language models use to generate predictions in context, and the diagnostics are applied to the popular BERT model.

Inducing Relational Knowledge from BERT

This work proposes a methodology for distilling relational knowledge from a pre-trained language model: the language model is fine-tuned to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.

Language Models as Knowledge Bases?

An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.

Inverting Grice's Maxims to Learn Rules from Natural Language Extractions

A mention model is introduced that models the probability of facts being mentioned in the text based on what other facts have already been mentioned and on domain knowledge in the form of Horn clause rules; learning must simultaneously search the space of rules and learn the parameters of the mention model.

Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

This work introduces the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens.

Unsupervised Commonsense Question Answering with Self-Talk

This work proposes an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks, inspired by inquiry-based discovery learning; it improves performance on several benchmarks and competes with models that obtain knowledge from external KBs.

Probing Neural Language Models for Human Tacit Assumptions

This work constructs a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture stereotypic tacit assumptions (STAs), and finds models to be profoundly effective at retrieving concepts given associated properties.