The Effectiveness of Masked Language Modeling and Adapters for Factual Knowledge Injection
@article{Wold2022TheEO,
  title   = {The Effectiveness of Masked Language Modeling and Adapters for Factual Knowledge Injection},
  author  = {Sondre Wold},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2210.00907}
}
This paper studies the problem of injecting factual knowledge into large pre-trained language models. We train adapter modules on parts of the ConceptNet knowledge graph using the masked language modeling objective and evaluate the success of the method by a series of probing experiments on the LAMA probe. Mean P@K curves for different configurations indicate that the technique is effective, increasing the performance on subsets of the LAMA probe for large values of k by adding as little as 2…
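As a rough illustration of the evaluation metric mentioned above (not taken from the paper; all function and variable names below are assumptions), the sketch shows how mean P@k is commonly computed for LAMA-style cloze queries: each masked slot has a single gold token, the model returns a ranked list of candidate tokens, and P@k checks whether the gold token appears among the top k.

```python
# Minimal sketch of mean P@k for cloze-style probing (illustrative only).
# Each example is a (gold_token, ranked_predictions) pair, where
# ranked_predictions is the model's candidate list sorted by score.

def precision_at_k(gold: str, ranked_predictions: list[str], k: int) -> float:
    """Return 1.0 if the gold token appears in the top-k predictions, else 0.0."""
    return float(gold in ranked_predictions[:k])

def mean_precision_at_k(examples: list[tuple[str, list[str]]], k: int) -> float:
    """Average P@k over all (gold, ranked_predictions) pairs."""
    return sum(precision_at_k(g, preds, k) for g, preds in examples) / len(examples)

if __name__ == "__main__":
    # Toy queries with hypothetical ranked model outputs.
    examples = [
        ("paris", ["lyon", "paris", "nice"]),      # gold found at rank 2
        ("guitar", ["piano", "violin", "drums"]),  # gold not in top 3
    ]
    for k in (1, 2, 3):
        print(f"mean P@{k} = {mean_precision_at_k(examples, k):.2f}")
```

Varying k in this way yields the mean P@K curves referred to in the abstract.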
References
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
- Computer Science, DEELIO
- 2020
A deeper analysis reveals that the adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and its corresponding Open Mind Common Sense corpus.
LM-CORE: Language Models with Contextually Relevant External Knowledge
- Computer Science, NAACL-HLT
- 2022
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks; can effectively handle knowledge updates; and performs well on two downstream tasks.
AdapterHub: A Framework for Adapting Transformers
- Computer Science, EMNLP
- 2020
AdapterHub is proposed, a framework that allows dynamic “stitching-in” of pre-trained adapters for different tasks and languages and enables scalable and easy sharing of task-specific models, particularly in low-resource scenarios.
K-BERT: Enabling Language Representation with Knowledge Graph
- Computer Science, AAAI
- 2020
This work proposes a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into the sentences as domain knowledge; K-BERT significantly outperforms BERT and shows promising results on twelve NLP tasks.
SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining
- Computer Science, ACL
- 2021
SMedBERT, a medical PLM trained on large-scale medical corpora, incorporates deep structured semantic knowledge from the neighbours of linked entities; a mention-neighbour hybrid attention is proposed to learn heterogeneous-entity information, infusing the semantic representations of entity types into the homogeneous neighbouring-entity structure.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Computer Science, NAACL
- 2019
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
ConceptNet 5.5: An Open Multilingual Graph of General Knowledge
- Computer Science, AAAI
- 2017
A new version of the linked open data resource ConceptNet is presented that is particularly well suited to be used with modern NLP techniques such as word embeddings, with state-of-the-art results on intrinsic evaluations of word relatedness that translate into improvements on applications of word vectors, including solving SAT-style analogies.
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
- Computer Science, NAACL
- 2021
This work proposes a new model, QA-GNN, which addresses the problem of answering questions using knowledge from pre-trained language models (LMs) and knowledge graphs (KGs) through two key innovations: relevance scoring and joint reasoning.
Knowledge Enhanced Contextual Word Representations
- Computer Science, EMNLP
- 2019
After integrating WordNet and a subset of Wikipedia into BERT, the knowledge-enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation.
RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Computer Science, ArXiv
- 2019
It is found that BERT was significantly undertrained and, with improved pretraining, can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.