Corpus ID: 233296761

Knowledge Neurons in Pretrained Transformers

@inproceedings{Dai2022KnowledgeNI,
  title={Knowledge Neurons in Pretrained Transformers},
  author={Damai Dai and Li Dong and Yaru Hao and Zhifang Sui and Furu Wei},
  booktitle={ACL},
  year={2022}
}
Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus. In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. Specifically, we examine the fill-in-the-blank cloze task for BERT. Given a relational fact, we propose a knowledge attribution method to identify the neurons that express the fact. We find that the activation of… 
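The abstract stops before the method's details, but the attribution idea it names can be sketched: scale a chosen FFN layer's intermediate activations from 0 to 1, accumulate gradients of the masked-token prediction along that path (integrated gradients), and score each intermediate neuron by activation times averaged gradient. The snippet below is a minimal, illustrative sketch along those lines using HuggingFace's bert-base-uncased; the prompt, target answer, layer index, and step count are arbitrary demonstration choices, not the paper's settings.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

prompt = "The capital of France is [MASK]."          # relational fact (illustrative)
answer_id = tokenizer.convert_tokens_to_ids("paris")
inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()

layer = 9                                            # layer to inspect (arbitrary choice)
ffn = model.bert.encoder.layer[layer].intermediate   # FFN intermediate activations (after GELU)
state = {}

def scale_hook(module, hook_inputs, output):
    # Scale the FFN intermediate activations by alpha and keep the scaled
    # tensor so its gradient can be read after backward().
    scaled = state["alpha"] * output
    scaled.retain_grad()
    state["scaled"] = scaled
    return scaled

handle = ffn.register_forward_hook(scale_hook)

steps, grad_sum = 20, None
for k in range(1, steps + 1):                        # Riemann approximation of the path integral
    state["alpha"] = k / steps
    prob = torch.softmax(model(**inputs).logits[0, mask_pos], dim=-1)[answer_id]
    model.zero_grad()
    prob.backward()
    g = state["scaled"].grad[0, mask_pos].detach()
    grad_sum = g if grad_sum is None else grad_sum + g
handle.remove()

# attribution_i ~ w_i * mean_k dP(k/m * w)/dw_i, with w the full activation (alpha = 1)
full_activation = state["scaled"][0, mask_pos].detach()
attribution = full_activation * grad_sum / steps
print("top candidate knowledge neurons in layer", layer,
      torch.topk(attribution, 5).indices.tolist())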

Citations

Finding patterns in Knowledge Attribution for Transformers
TLDR
It is found that grammatical knowledge is far more dispersed among the neurons than factual knowledge, and can be attributed to middle and higher layers of the network.
Locating and Editing Factual Knowledge in GPT
TLDR
Using COUNTERFACT, the distinction between saying and knowing neurons is confirmed, and ROME, a novel method for editing facts stored in model weights, is developed and shown to achieve state-of-the-art performance in knowledge editing compared to other methods.
BERTnesia: Investigating the capture and forgetting of knowledge in BERT
TLDR
This paper utilizes knowledge base completion tasks to probe every layer of pre-trained as well as fine-tuned BERT (ranking, question answering, NER) and finds that ranking models forget the least and retain more knowledge in their final layer.
Distilling Relation Embeddings from Pretrained Language Models
TLDR
The resulting relation embeddings are highly competitive on analogy (unsupervised) and relation classification (supervised) benchmarks, even without any task-specific fine-tuning.
Analyzing Commonsense Emergence in Few-shot Knowledge Models
TLDR
The results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to encoded knowledge learned during pretraining.
Emergent Structures and Training Dynamics in Large Language Models
TLDR
The lack of sufficient research on the emergence of functional units and subsections of the network within large language models motivates future work that grounds the study of language models in an analysis of their changing internal structure during training.
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
TLDR
This work reverse-engineers the operation of the feed-forward network layers, one of the building blocks of transformer models, and shows that each update can be decomposed into sub-updates corresponding to single FFN parameter vectors, each promoting concepts that are often human-interpretable.
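A rough sketch of the kind of inspection this summary describes: read out a single FFN "value vector" (one row of the second FFN matrix) from GPT-2 and project it through the output embedding to see which tokens it promotes. The layer and neuron indices below are arbitrary, and this is an illustrative reading of the decomposition, not the paper's full method.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

layer, neuron = 10, 42                             # arbitrary illustrative choices
# In GPT-2's Conv1D parameterisation, row i of mlp.c_proj.weight is the value
# vector added to the residual stream in proportion to intermediate neuron i.
value_vec = model.transformer.h[layer].mlp.c_proj.weight[neuron]   # [hidden]

# Project the sub-update onto the vocabulary with the (tied) output embedding.
scores = model.lm_head.weight @ value_vec                          # [vocab]
top_ids = torch.topk(scores, 10).indices.tolist()
print([tokenizer.decode([i]).strip() for i in top_ids])            # concepts this neuron promotes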
A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models
TLDR
This paper aims to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs) by dissecting their three vital elements: knowledge sources, knowledge-intensive NLP tasks, and knowledge fusion methods.
Sparse Interventions in Language Models with Differentiable Masking
TLDR
Inspired by causal mediation analysis, this work proposes a method that discovers, within a neural LM, a small subset of neurons responsible for a particular linguistic phenomenon, i.e., a subset that causes a change in the corresponding token emission probabilities.
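As a rough illustration of the masking idea only: the paper learns a hard-concrete (L0) relaxation over a pretrained LM's neurons, whereas this toy sketch uses a plain sigmoid mask with an L1-style penalty over a frozen, randomly initialised MLP. It shows the mechanism of learning which hidden units must be zeroed out to push the model toward a chosen output.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
mlp = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
for p in mlp.parameters():
    p.requires_grad_(False)                       # the model itself stays frozen

x = torch.randn(32, 8)                            # illustrative inputs
flip_to = torch.zeros(32, dtype=torch.long)       # behaviour we want after the intervention

mask_logits = torch.full((64,), -3.0, requires_grad=True)   # start with almost nothing masked
opt = torch.optim.Adam([mask_logits], lr=0.1)

for step in range(300):
    gate = torch.sigmoid(mask_logits)             # ~1 means "zero this neuron out"
    h = torch.relu(mlp[0](x)) * (1.0 - gate)      # intervene on the hidden units
    logits = mlp[2](h)
    loss = F.cross_entropy(logits, flip_to) + 0.05 * gate.sum()   # task loss + sparsity penalty
    opt.zero_grad()
    loss.backward()
    opt.step()

responsible = (torch.sigmoid(mask_logits) > 0.5).nonzero().flatten()
print("neurons selected by the mask:", responsible.tolist())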
Evaluating Inexact Unlearning Requires Revisiting Forgetting
TLDR
It is empirically shown that two simple unlearning methods, exact-unlearning and catastrophic-forgetting applied to the final k layers of a network, outperform prior unlearning methods when scaled to large deletion sets.

References

SHOWING 1-10 OF 39 REFERENCES
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
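This probing setup can be reproduced in a few lines: query BERT's masked-LM head directly, with no fine-tuning, and read off its top completions for a relational cloze query. The prompt below is an illustrative example, not one of the paper's probes.

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    # each prediction carries the filled token and the model's probability for it
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")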
Analyzing the Structure of Attention in a Transformer Language Model
TLDR
It is found that attention targets different parts of speech at different layer depths within the model, that attention aligns with dependency relations most strongly in the middle layers, and that the deepest layers of the model capture the most distant relationships.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
BERT Rediscovers the Classical NLP Pipeline
TLDR
This work finds that the model represents the steps of the traditional NLP pipeline in an interpretable and localizable way, and that the regions responsible for each step appear in the expected sequence: POS tagging, parsing, NER, semantic roles, then coreference.
What Does BERT Look at? An Analysis of BERT’s Attention
TLDR
It is shown that certain attention heads correspond well to linguistic notions of syntax and coreference, and an attention-based probing classifier is proposed and used to demonstrate that substantial syntactic information is captured in BERT’s attention.
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
TLDR
This investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs, and suggests that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.
Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy.
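For context, the attribution method those axioms lead to (integrated gradients) is short enough to sketch. The toy network and inputs below are illustrative, and the step count trades accuracy of the path-integral approximation against compute.

import torch

def integrated_gradients(model, x, baseline, target, steps=50):
    # IG_i(x) = (x_i - b_i) * (1/m) * sum_{k=1..m} dF(b + (k/m)(x - b)) / dx_i
    grad_sum = torch.zeros_like(x)
    for k in range(1, steps + 1):
        point = (baseline + (k / steps) * (x - baseline)).detach().requires_grad_(True)
        out = model(point.unsqueeze(0))[0, target]
        grad, = torch.autograd.grad(out, point)
        grad_sum += grad
    return (x - baseline) * grad_sum / steps

# toy usage: attributions approximately sum to F(x) - F(baseline) (completeness axiom)
torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 3))
x, baseline = torch.randn(4), torch.zeros(4)
attr = integrated_gradients(net, x, baseline, target=2)
print(attr, attr.sum().item())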
Unified Language Model Pre-training for Natural Language Understanding and Generation
TLDR
A new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks, and that compares favorably with BERT on the GLUE benchmark and on the SQuAD 2.0 and CoQA question answering tasks.
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
TLDR
The experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
TLDR
The contextual representations learned by the proposed replaced token detection pre-training task substantially outperform the ones learned by methods such as BERT and XLNet given the same model size, data, and compute.