Plug-and-Play Adaptation for Continuously-updated QA

Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee
Language models (LMs) have shown great potential as implicit knowledge bases (KBs). For their practical use, the knowledge in LMs needs to be updated periodically. However, existing tasks for assessing LMs' efficacy as KBs do not adequately consider multiple large-scale updates. To this end, we first propose a novel task, Continuously-updated QA (CuQA), in which multiple large-scale updates are made to LMs, and performance is measured with respect to success in adding and updating knowledge…




Editing Factual Knowledge in Language Models

This work presents KnowledgeEditor, a method which can be used to edit factual knowledge and, thus, fix ‘bugs’ or unexpected predictions without the need for expensive re-training or fine-tuning.

Time-Aware Language Models as Temporal Knowledge Bases

This work proposes a simple technique for jointly modeling text with its timestamp that improves memorization of seen facts from the training time period, as well as calibration on predictions about unseen facts from future time periods and shows that models trained with temporal context can be efficiently "refreshed" as new data arrives.

Language Models are Few-Shot Learners

GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.

K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters

K-Adapter is proposed, which keeps the original parameters of the pre-trained model fixed, supports continual knowledge infusion, and captures richer factual and commonsense knowledge than RoBERTa.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

Language Models as Knowledge Bases?

An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.

LoRA: Low-Rank Adaptation of Large Language Models

Low-Rank Adaptation, or LoRA, is proposed, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
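The core mechanism is simple enough to sketch in a few lines. Below is a minimal NumPy illustration (not the paper's implementation; `lora_forward` and all names are hypothetical): the frozen weight W is augmented with a trainable low-rank product A·B, and B is zero-initialized so the adapted model starts out identical to the base model.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a LoRA-style update: y = x @ (W + alpha * A @ B).

    W is the frozen pre-trained weight (d_in x d_out); only the low-rank
    factors A (d_in x r) and B (r x d_out) are trained.
    """
    return x @ W + alpha * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 768, 768, 8
W = rng.standard_normal((d_in, d_out))    # frozen pre-trained weight
A = rng.standard_normal((d_in, r)) * 0.01 # small random init
B = np.zeros((r, d_out))                  # zero init: output initially equals x @ W

x = rng.standard_normal((2, d_in))
y = lora_forward(x, W, A, B)

# Trainable parameters shrink from d_in*d_out to r*(d_in + d_out).
trainable = d_in * r + r * d_out   # 12,288
full = d_in * d_out                # 589,824
```

With rank r = 8 on a 768×768 projection, the trainable parameter count drops by roughly 48×, which is what makes per-task adaptation cheap to store and swap.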

Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning

The experiments show that by adding just 2-3% more parameters per task, the model can match or even exceed the performance of fine-tuning the whole model.
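The kind of bottleneck adapter this line of work relies on can be sketched as follows (a NumPy stand-in under assumed dimensions, not the paper's code): a down-projection, a nonlinearity, an up-projection, and a residual connection, with the base model's weights left untouched.

```python
import numpy as np

def adapter(h, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, add residual."""
    z = np.maximum(h @ W_down, 0.0)  # compress to a small bottleneck
    return h + z @ W_up              # residual keeps the base representation

d, bottleneck = 768, 12              # 2*d*bottleneck ~= 3% of one d x d layer
rng = np.random.default_rng(1)
W_down = rng.standard_normal((d, bottleneck)) * 0.01
W_up = np.zeros((bottleneck, d))     # zero init -> adapter starts as identity

h = rng.standard_normal((4, d))
out = adapter(h, W_down, W_up)
```

Because only W_down and W_up are trained per task, each new task costs a small fraction of the full model's parameters, which is the versatility the abstract refers to.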

Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries

Three entity representations that allow LMs to handle millions of entities are explored and a detailed case study on paraphrased querying of facts stored in LMs is presented, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.

Augmenting Self-attention with Persistent Memory

A new model consisting solely of attention layers is proposed, which augments the self-attention layers with persistent memory vectors that play a role similar to that of the feed-forward layer.
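The idea can be sketched in NumPy (an illustrative single-head version with made-up dimensions, not the authors' implementation): learned, input-independent key/value vectors are concatenated with the sequence's keys and values, so attention can read from them the way a feed-forward layer would read from its weights.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_persistent_memory(q, K, V, M_k, M_v):
    """Single-head attention over the input keys/values plus learned
    persistent memory vectors M_k, M_v (shared across all inputs)."""
    K_aug = np.concatenate([K, M_k], axis=0)
    V_aug = np.concatenate([V, M_v], axis=0)
    scores = q @ K_aug.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ V_aug

rng = np.random.default_rng(2)
d, seq, n_mem = 64, 10, 4
q = rng.standard_normal((seq, d))
K = rng.standard_normal((seq, d))
V = rng.standard_normal((seq, d))
M_k = rng.standard_normal((n_mem, d))  # persistent vectors: trained, not computed from input
M_v = rng.standard_normal((n_mem, d))

out = attention_with_persistent_memory(q, K, V, M_k, M_v)
```

Since the persistent vectors do not depend on the input, they act as fixed storage the model attends to, which is why the paper can drop the separate feed-forward sublayer.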