Corpus ID: 239015890

Knowledge Enhanced Pretrained Language Models: A Comprehensive Survey

@article{Wei2021KnowledgeEP,
  title={Knowledge Enhanced Pretrained Language Models: A Comprehensive Survey},
  author={Xiaokai Wei and Shen Wang and Dejiao Zhang and Parminder Bhatia and Andrew O. Arnold},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.08455}
}
Pretrained Language Models (PLMs) have established a new paradigm through learning informative contextualized representations on large-scale text corpora. This new paradigm has revolutionized the entire field of natural language processing and set new state-of-the-art performance for a wide variety of NLP tasks. However, although PLMs can store certain knowledge/facts from the training corpus, their knowledge awareness is still far from satisfactory. To address this issue, integrating knowledge…
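The "new paradigm" the abstract refers to is, for most of the surveyed models, masked language modeling over large unlabeled corpora. The sketch below is a minimal, generic illustration of that objective (it is not code from the survey); the vocabulary size, hidden size, and mask id are assumed BERT-like toy values.

```python
# Minimal sketch of the masked language modeling (MLM) objective behind the PLM
# paradigm; every size and id below is an assumed, BERT-like toy value.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, HIDDEN, MASK_ID = 30522, 256, 103

class TinyMLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True),
            num_layers=2)
        self.lm_head = nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, token_ids):
        return self.lm_head(self.encoder(self.embed(token_ids)))

def mlm_loss(model, token_ids, mask_prob=0.15):
    # Randomly corrupt a fraction of tokens and predict their original ids.
    mask = torch.rand(token_ids.shape) < mask_prob
    mask[:, 0] = True                                # guarantee >=1 masked position
    logits = model(token_ids.masked_fill(mask, MASK_ID))
    labels = token_ids.masked_fill(~mask, -100)      # -100 positions are ignored
    return F.cross_entropy(logits.view(-1, VOCAB_SIZE), labels.view(-1),
                           ignore_index=-100)

model = TinyMLM()
batch = torch.randint(0, VOCAB_SIZE, (2, 16))        # stand-in for a text batch
mlm_loss(model, batch).backward()
```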


References

SHOWING 1-10 OF 111 REFERENCES
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
TLDR
This work proposes a simple yet effective weakly supervised pretraining objective, which explicitly forces the model to incorporate knowledge about real-world entities, and consistently outperforms BERT on four entity-related question answering datasets.
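As a rough illustration of a weakly supervised, entity-centric pretraining signal of the kind described above, the sketch below corrupts entity mentions by swapping them for other entities of the same type and produces labels for a replacement-detection objective. The spans, type pool, and 50% corruption rate are toy assumptions, not the paper's exact recipe.

```python
# Toy sketch of an entity-replacement corruption step for a replacement-detection
# objective; spans, types, and the 50% rate are illustrative assumptions.
import random

def corrupt_entities(tokens, entity_spans, entity_pool):
    """entity_spans: list of (start, end, type). Spans are edited in reverse order
    so earlier indices stay valid. Returns corrupted tokens and one 0/1 label per
    span (1 = the mention was swapped for another entity of the same type)."""
    tokens = list(tokens)
    labels = [0] * len(entity_spans)
    for i, (start, end, ent_type) in sorted(enumerate(entity_spans),
                                            key=lambda x: -x[1][0]):
        if random.random() < 0.5:
            # A real pipeline would exclude the original mention from the pool.
            tokens[start:end] = random.choice(entity_pool[ent_type])
            labels[i] = 1
    return tokens, labels

pool = {"PERSON": [["Dante"], ["Goethe"]], "CITY": [["Paris"], ["Rome"]]}
toks, labels = corrupt_entities(["Dante", "was", "born", "in", "Florence"],
                                [(0, 1, "PERSON"), (4, 5, "CITY")], pool)
print(toks, labels)  # an encoder + binary head would be trained on these labels
```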
CoLAKE: Contextualized Language and Knowledge Embedding
TLDR
The Contextualized Language and Knowledge Embedding (CoLAKE) is proposed, which jointly learns contextualized representations for both language and knowledge with an extended MLM objective, and achieves surprisingly high performance on a synthetic task called word-knowledge graph completion, which shows the superiority of simultaneously contextualizing language and knowledge representations.
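A heavily simplified sketch of the joint language-and-knowledge masking idea: word tokens and linked entity nodes share one input sequence, each with its own embedding table and prediction head. The unified word-knowledge graph construction and structured attention masking from the paper are omitted; all vocabularies and sizes are toy assumptions.

```python
# Hedged sketch of an "extended MLM" over both word tokens and entity nodes.
import torch
import torch.nn as nn

WORD_VOCAB, ENT_VOCAB, HIDDEN = 30000, 5000, 256

word_embed, ent_embed = nn.Embedding(WORD_VOCAB, HIDDEN), nn.Embedding(ENT_VOCAB, HIDDEN)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True), num_layers=2)
word_head, ent_head = nn.Linear(HIDDEN, WORD_VOCAB), nn.Linear(HIDDEN, ENT_VOCAB)

word_ids = torch.randint(0, WORD_VOCAB, (2, 16))   # sentence tokens
ent_ids = torch.randint(0, ENT_VOCAB, (2, 4))      # entities linked from the sentence

# Words and their linked entity nodes form one sequence the encoder contextualizes jointly.
hidden = encoder(torch.cat([word_embed(word_ids), ent_embed(ent_ids)], dim=1))
word_logits = word_head(hidden[:, :16])            # predicts masked word tokens
ent_logits = ent_head(hidden[:, 16:])              # predicts masked entity nodes
# The training loss sums cross-entropy over masked word and masked entity positions.
```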
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
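The kind of probing analyzed above can be reproduced with a simple cloze query against an off-the-shelf masked language model. A minimal example, assuming the Hugging Face transformers library (and PyTorch) is installed; model weights are downloaded on first use.

```python
# Cloze-style probe of the factual knowledge stored in a pretrained masked LM,
# using the classic "Dante was born in [MASK]." relational query.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("Dante was born in [MASK]."):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```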
ERNIE: Enhanced Language Representation with Informative Entities
TLDR
This paper utilizes both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE) which can take full advantage of lexical, syntactic, and knowledge information simultaneously, and is comparable with the state-of-the-art model BERT on other common NLP tasks.
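As a hedged sketch of fusing contextual token states with aligned knowledge-graph entity embeddings (the general mechanism described above, not the paper's exact aggregator), the snippet below projects both into a shared space and combines them only at positions linked to an entity; all shapes are illustrative.

```python
# Toy fusion of encoder hidden states with aligned KG entity embeddings.
import torch
import torch.nn as nn

class TokenEntityFusion(nn.Module):
    def __init__(self, hidden=256, entity_dim=100):
        super().__init__()
        self.token_proj = nn.Linear(hidden, hidden)
        self.entity_proj = nn.Linear(entity_dim, hidden)

    def forward(self, token_states, entity_embeds, alignment_mask):
        # alignment_mask is 1 where a token is aligned to an entity mention, else 0.
        return torch.tanh(self.token_proj(token_states)
                          + alignment_mask.unsqueeze(-1) * self.entity_proj(entity_embeds))

fusion = TokenEntityFusion()
tokens = torch.randn(2, 16, 256)          # encoder hidden states
entities = torch.randn(2, 16, 100)        # per-token KG embeddings (zeros if unaligned)
mask = (torch.rand(2, 16) < 0.2).float()  # ~20% of tokens linked to an entity
print(fusion(tokens, entities, mask).shape)  # torch.Size([2, 16, 256])
```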
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
TLDR
A novel contrastive learning framework named ERICA is proposed in the pre-training phase to obtain a deeper understanding of the entities and their relations in text; it achieves consistent improvements on several document-level language understanding tasks, including relation extraction and reading comprehension, especially in low-resource settings.
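Contrastive objectives of the kind described above are typically variants of InfoNCE with in-batch negatives. A minimal sketch under that assumption, with randomly generated stand-ins for the entity (or entity-pair) representations:

```python
# InfoNCE-style contrastive loss with in-batch negatives; inputs are toy tensors.
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.07):
    """anchors, positives: (batch, dim). The i-th positive matches the i-th anchor;
    every other row in `positives` serves as an in-batch negative."""
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    logits = anchors @ positives.t() / temperature   # (batch, batch) similarities
    targets = torch.arange(anchors.size(0))          # diagonal entries are positives
    return F.cross_entropy(logits, targets)

anchor_reps = torch.randn(8, 256)    # e.g. entity representations from one mention
positive_reps = torch.randn(8, 256)  # the same entities mentioned elsewhere
loss = info_nce(anchor_reps, positive_reps)
```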
Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity
TLDR
The experiments suggest that the lexically informed BERT (LIBERT), specialized for word-level semantic similarity, yields better performance than the lexically blind "vanilla" BERT on several language understanding tasks, and shows consistent gains on 3 benchmarks for lexical simplification.
KgPLM: Knowledge-guided Language Model Pre-training via Generative and Discriminative Learning
TLDR
This work presents a language model pre-training framework guided by factual knowledge completion and verification, and uses the generative and discriminative approaches cooperatively to learn the model.
Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension
TLDR
This work introduces KT-NET, which employs an attention mechanism to adaptively select desired knowledge from KBs, and then fuses selected knowledge with BERT to enable context- and knowledge-aware predictions.
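A hedged sketch of the general pattern described above: each token attends over a set of retrieved KB concept embeddings, and the attended knowledge vector is concatenated with the contextual state. Retrieval and the exact fusion layer are assumed to differ from the paper; dimensions are illustrative.

```python
# Per-token attention over retrieved KB concept embeddings, fused by concatenation.
import torch
import torch.nn.functional as F

def knowledge_attention(token_states, kb_embeds):
    """token_states: (batch, seq, hidden); kb_embeds: (batch, n_concepts, hidden).
    Each token softly selects relevant concepts via dot-product attention."""
    scores = token_states @ kb_embeds.transpose(1, 2)      # (batch, seq, n_concepts)
    attn = F.softmax(scores, dim=-1)
    selected = attn @ kb_embeds                            # (batch, seq, hidden)
    return torch.cat([token_states, selected], dim=-1)     # knowledge-aware states

tokens = torch.randn(2, 16, 256)     # e.g. BERT outputs
concepts = torch.randn(2, 10, 256)   # candidate KB embeddings projected to the same size
print(knowledge_attention(tokens, concepts).shape)  # torch.Size([2, 16, 512])
```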
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
TLDR
Experimental results demonstrate that pre-training models using the proposed approach followed by fine-tuning achieve significant improvements over previous state-of-the-art models on two commonsense-related benchmarks, including CommonsenseQA and Winograd Schema Challenge.
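A toy sketch of the align-mask-select style of data construction suggested by the title above: align a sentence with a commonsense triple, mask the aligned concept, and build a multiple-choice example with distractor concepts. The triple, sentence, and distractors are invented stand-ins, not the paper's pipeline.

```python
# Build a multiple-choice example by masking the concept aligned with a KG triple.
import random

def make_multiple_choice(sentence, triple, distractors, num_choices=4):
    head, relation, tail = triple
    if tail not in sentence:
        return None                                   # no alignment, skip
    question = sentence.replace(tail, "[MASK]", 1)    # mask the aligned concept
    choices = random.sample(distractors, num_choices - 1) + [tail]
    random.shuffle(choices)
    return {"question": question, "choices": choices, "answer": tail}

example = make_multiple_choice(
    "You would use an umbrella because it keeps you dry in the rain.",
    ("umbrella", "UsedFor", "keeps you dry in the rain"),
    ["tells the time", "plays music", "cuts paper", "stores food"],
)
print(example)
```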
CoCoLM: COmplex COmmonsense Enhanced Language Model
TLDR
Through careful training over the large-scale eventuality knowledge graph ASER, the proposed general language model CoCoLM successfully teaches pre-trained language models (i.e., BERT and RoBERTa) rich complex commonsense knowledge among eventualities.