Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, M. Liao, Tong Xue, and Jin Xu. Annual Meeting of the Association for Computational Linguistics.

Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. This paper conducts a rigorous study of the underlying prediction mechanisms of MLMs across different extraction paradigms. By investigating the behaviors of MLMs, the authors find that previously reported performance mainly owes to biased prompts which…

Calibrating Factual Knowledge in Pretrained Language Models

This work proposes a simple and lightweight method to calibrate factual knowledge in PLMs without re-training from scratch, and shows the calibration effectiveness and efficiency.

Pre-training Language Models with Deterministic Factual Knowledge

Factual knowledge probing experiments indicate that the continually pre-trained PLMs are more robust at capturing factual knowledge, and that learning deterministic relationships with the proposed methods can also help other knowledge-intensive tasks.

KAMEL : Knowledge Analysis with Multitoken Entities in Language Models

This work presents a novel Wikidata-based benchmark dataset, KAMEL, for probing relational knowledge in LMs, and shows that even large language models are far from being able to memorize all varieties of relational knowledge that is usually stored in knowledge graphs.

What Has Been Enhanced in my Knowledge-Enhanced Language Model?

This work revisits KI from an information-theoretic view and proposes a new theoretically sound probe called Graph Convolution Simulator (GCS) for KI interpretation, which uses graph attention on the corresponding knowledge graph for interpretation.

DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding

A novel KEPLM named DKPLM is proposed that decomposes the knowledge injection process of pre-trained language models across the pre-training, fine-tuning, and inference stages, which facilitates the application of KEPLMs in real-world scenarios.

How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis

A causal-inspired analysis quantitatively measures and evaluates the word-level patterns that PLMs depend on to generate missing words, and concludes that PLMs capture factual knowledge ineffectively because they rely on inadequate associations.

Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View

This paper investigates the prompt-based probing from a causal view, highlights three critical biases which could induce biased results and conclusions, and proposes to conduct debiasing via causal intervention.

P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

The authors investigate what makes a P-Adapter successful and conclude that access to the LLM's embeddings of the original natural language prompt, particularly of the subject of the entity pair being asked about, is a significant factor.

SPE: Symmetrical Prompt Enhancement for Fact Probing

This work proposes Symmetrical Prompt Enhancement (SPE), a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction.

LM-KBC: Knowledge Base Construction from Pre-trained Language Models

The authors present a system that performed task-specific pre-training of BERT, employed prompt decomposition for progressive generation of candidate objects, and used adaptive thresholds for final candidate object selection.

Eliciting Knowledge from Language Models Using Automatically Generated Prompts

The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the-blanks problems…

X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

A code-switching-based method is proposed to improve the ability of multilingual LMs to access knowledge and properly handle language variations, and its effectiveness is verified on several benchmark languages.

Commonsense Knowledge Mining from Pretrained Models

This work develops a method for generating commonsense knowledge using a large, pre-trained bidirectional language model that can be used to rank a triple’s validity by the estimated pointwise mutual information between the two entities.

Language Models as Knowledge Bases?

An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.

Are Pretrained Language Models Symbolic Reasoners over Knowledge?

This is the first study that investigates the causal relation between facts present in training and facts learned by the PLM, and shows that PLMs seem to learn to apply some symbolic reasoning rules correctly but struggle with others, including two-hop reasoning.

How Context Affects Language Models' Factual Predictions

This paper reports that augmenting pre-trained language models with relevant retrieved context dramatically improves factual predictions, and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline.

Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs

It is concluded that a variety of methods is necessary to reveal all relevant aspects of a model’s grammatical knowledge in a given domain.

How Much Knowledge Can You Pack into the Parameters of a Language Model?

It is shown that this approach scales surprisingly well with model size and outperforms models that explicitly look up knowledge on the open-domain variants of Natural Questions and WebQuestions.

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

This paper introduces a new scoring method that casts a plausibility ranking task in a full-text format and leverages the masked language modeling head tuned during the pre-training phase, requiring less annotated data than the standard classifier approach to reach equivalent performance.

Inducing Relational Knowledge from BERT

This work proposes a methodology for distilling relational knowledge from a pre-trained language model by fine-tuning the model to predict whether a given word pair is likely to be an instance of some relation, given an instantiated template for that relation as input.