Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs could potentially be a reliable knowledge source. This paper conducts a rigorous study of the underlying prediction mechanisms of MLMs across different extraction paradigms. By investigating the behaviors of MLMs, it finds that the previously reported decent performance mainly owes to biased prompts which…
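
The prompt-bias concern above can be made concrete with a "prompt-only" baseline: predict the majority object for a relation while ignoring the subject entirely. A minimal sketch, with hypothetical gold labels for illustration (the relation name and counts are not from the paper):

```python
from collections import Counter

# Hypothetical gold objects for a "capital_of" probe set (toy data).
gold = ["Paris", "Paris", "London", "Paris", "Rome"]

# Prompt-only baseline: always predict the most frequent object,
# ignoring the subject. High accuracy here signals a biased
# prompt/dataset distribution rather than genuine knowledge.
majority, _ = Counter(gold).most_common(1)[0]
accuracy = sum(g == majority for g in gold) / len(gold)
print(majority, accuracy)  # Paris 0.6
```

If an MLM's probing accuracy barely exceeds this baseline, the prompt, not the model's knowledge, may be doing the work.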

DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding

A novel KEPLM named DKPLM is proposed that decomposes the knowledge injection process of pre-trained language models into the pre-training, fine-tuning, and inference stages, facilitating the application of KEPLMs in real-world scenarios.

How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis

A causal-inspired analysis quantitatively measures and evaluates the word-level patterns that PLMs depend on to generate missing words, and concludes that PLMs capture factual knowledge ineffectively because they rely on inadequate associations.

Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View

This paper investigates the prompt-based probing from a causal view, highlights three critical biases which could induce biased results and conclusions, and proposes to conduct debiasing via causal intervention.

P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

What makes a P-Adapter successful is investigated and it is concluded that access to the LLM’s embeddings of the original natural language prompt, particularly the subject of the entity pair being asked about, is a significant factor.

LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

This paper addresses VQA as a text generation task with an effective encoder-decoder paradigm; to incorporate an external KG, triples are transformed into text and a late injection mechanism is proposed.

A Survey of Knowledge Enhanced Pre-trained Models

A comprehensive overview of KEPTMs in NLP and CV is provided and the progress of pre-trained models and knowledge representation learning is introduced.

Neural Knowledge Bank for Pretrained Transformers

A Neural Knowledge Bank (NKB) and a knowledge injection strategy are proposed to introduce extra factual knowledge into pretrained Transformers; the interpretability of the NKB is thoroughly analyzed, revealing the meaning of its keys and values in a human-readable way.

Mention Memory: incorporating textual knowledge into Transformers through entity mention attention

The proposed model - TOME - is a Transformer that accesses the information through internal memory layers in which each entity mention in the input passage attends to the mention memory, which enables synthesis of and reasoning over many disparate sources of information within a single Transformer model.

Do Prompt-Based Models Really Understand the Meaning of Their Prompts?

It is found that models can learn just as fast with many prompts that are intentionally irrelevant or even pathologically misleading as they do with instructively “good” prompts, and instruction-tuned models often produce good predictions with irrelevant and misleading prompts even at zero shots.

DeepStruct: Pretraining of Language Models for Structure Prediction

It is shown that a 10B-parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of the 28 datasets evaluated.

Eliciting Knowledge from Language Models Using Automatically Generated Prompts

The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the-blanks problems…

Commonsense Knowledge Mining from Pretrained Models

This work develops a method for generating commonsense knowledge using a large, pre-trained bidirectional language model, which ranks a triple's validity by the estimated pointwise mutual information between its two entities.

Language Models as Knowledge Bases?

An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
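
Probing of this kind turns a KB triple into a cloze query that an MLM can fill. A minimal sketch of the template step; the relation names and template strings here are illustrative, not the paper's actual templates:

```python
# Map each relation to a cloze template, LAMA-style.
TEMPLATES = {
    "place_of_birth": "[X] was born in [MASK].",
    "capital_of": "The capital of [X] is [MASK].",
}

def triple_to_cloze(subject: str, relation: str) -> str:
    """Instantiate the relation's template with the subject entity."""
    return TEMPLATES[relation].replace("[X]", subject)

print(triple_to_cloze("Dante", "place_of_birth"))
# -> Dante was born in [MASK].
```

The resulting string is fed to a masked language model, and the model's top prediction for the [MASK] slot is compared against the gold object.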

Are Pretrained Language Models Symbolic Reasoners over Knowledge?

This is the first study that investigates the causal relation between facts present in training and facts learned by the PLM, and shows that PLMs seem to learn to apply some symbolic reasoning rules correctly but struggle with others, including two-hop reasoning.

How Much Knowledge Can You Pack into the Parameters of a Language Model?

It is shown that this approach scales surprisingly well with model size and outperforms models that explicitly look up knowledge on the open-domain variants of Natural Questions and WebQuestions.

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

This paper introduces a new scoring method that casts a plausibility ranking task in a full-text format and leverages the masked language modeling head tuned during pre-training; it requires less annotated data than the standard classifier approach to reach equivalent performance.
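
Scoring full-text candidates with an MLM head amounts to summing per-token log-probabilities and picking the highest-scoring candidate. A minimal sketch, with a fixed toy log-probability table standing in for the model (the real method masks each token in turn and queries the MLM head):

```python
# Toy "language model": unigram log-probabilities standing in for
# an MLM head, so the sketch stays self-contained and runnable.
LOGPROB = {"the": -1.0, "bird": -3.0, "flies": -3.5, "table": -6.0}

def sentence_score(tokens):
    """Pseudo-log-likelihood: sum of per-token log-probabilities."""
    return sum(LOGPROB.get(t, -10.0) for t in tokens)

candidates = [["the", "bird", "flies"], ["the", "table", "flies"]]
best = max(candidates, key=sentence_score)
print(best)  # the more plausible candidate wins the ranking
```

Because the score comes from the pre-trained head rather than a freshly initialized classifier, far less labeled data is needed to rank candidates sensibly.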

Inducing Relational Knowledge from BERT

This work proposes a methodology for distilling relational knowledge from a pre-trained language model: the model is fine-tuned to predict whether a given word pair is likely to be an instance of some relation, given an instantiated template for that relation as input.

Birds Have Four Legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models

This work investigates whether and to what extent numerical commonsense knowledge can be induced from PTLMs, as well as the robustness of this process, and finds that such induction may not work for numerical commonsense knowledge.

Language Models are Open Knowledge Graphs

This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision, and proposes an unsupervised method to cast the knowledge contained within language models into KGs.

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

A suite of diagnostics drawn from human language experiments is introduced, allowing targeted questions about the information language models use to generate predictions in context, and these diagnostics are applied to the popular BERT model.