Factual Consistency of Multilingual Pretrained Language Models

Constanza Fierro and Anders Søgaard
Pretrained language models can be queried for factual knowledge, with potential applications in knowledge base acquisition and tasks that require inference. However, for that, we need to know how reliable this knowledge is, and recent work has shown that monolingual English language models lack consistency when predicting factual knowledge, that is, they fill-in-the-blank differently for paraphrases describing the same fact. In this paper, we extend the analysis of consistency to a multilingual… 
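The notion of consistency described above can be made concrete with a small sketch. It assumes consistency is scored as the fraction of paraphrase pairs for which the model fills in the same answer. This is one common formulation, and the abstract does not confirm that it is the exact metric used in the paper. The function name `pairwise_consistency` and the example predictions are hypothetical.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Return the fraction of paraphrase pairs whose predictions agree.

    `predictions` holds one fill-in-the-blank answer per paraphrase
    of the same fact (hypothetical input format).
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        # Assumed convention: a single paraphrase is trivially consistent.
        return 1.0
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)


# Hypothetical model outputs for three paraphrases of one fact.
preds = ["Paris", "Paris", "Lyon"]
score = pairwise_consistency(preds)
print(f"consistency = {score:.2f}")
```

Under this formulation, a model can be accurate on one phrasing of a fact and still score low, because every disagreeing pair of paraphrases counts against it.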


Measuring Reliability of Large Language Models through Semantic Consistency

A measure of semantic consistency that allows the comparison of open-ended text outputs is developed; it is considerably more consistent than traditional metrics based on lexical consistency and correlates more strongly with human evaluation of output consistency.



X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

A code-switching-based method is proposed to improve the ability of multilingual LMs to access knowledge, and its effectiveness is verified on several benchmark languages.

Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models

This work translates the established benchmarks TREx and GoogleRE into 53 languages and finds that using mBERT as a knowledge base yields varying performance across languages and that pooling predictions across languages improves performance.

Measuring and Improving Consistency in Pretrained Language Models

This work creates PARAREL, a high-quality resource of English cloze-style query paraphrases; analysis of the representational spaces of PLMs suggests that they are poorly structured and currently not suitable for representing knowledge in a robust way.

How Linguistically Fair Are Multilingual Pre-Trained Language Models?

This work scrutinizes the choices made in previous work, proposes a few different strategies for fair and efficient model selection based on the principles of fairness in economics and social choice theory, and emphasizes Rawlsian fairness.

Accurate, yet inconsistent? Consistency Analysis on Language Understanding Models

It is confirmed that current PLMs are prone to generating inconsistent predictions even for semantically identical inputs, and observed that multi-task training with paraphrase identification tasks improves consistency by 13% on average.

Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries

Three entity representations that allow LMs to handle millions of entities are explored and a detailed case study on paraphrased querying of facts stored in LMs is presented, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.

Unsupervised Cross-lingual Representation Learning at Scale

Pretraining multilingual language models at scale is shown to yield significant performance gains on a wide range of cross-lingual transfer tasks, and multilingual modeling without sacrificing per-language performance is demonstrated for the first time.

Inducing Relational Knowledge from BERT

This work proposes a methodology for distilling relational knowledge from a pre-trained language model: the language model is fine-tuned to predict whether a given word pair is likely to be an instance of some relation, given an instantiated template for that relation as input.

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

This work describes two mechanisms to improve belief consistency in the overall system, enabling PTLM-based architectures with a systematic notion of belief to construct a more coherent picture of the world, and improve over time without model retraining.

P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

What makes a P-Adapter successful is investigated and it is concluded that access to the LLM’s embeddings of the original natural language prompt, particularly the subject of the entity pair being asked about, is a significant factor.