CoLAKE: Contextualized Language and Knowledge Embedding

@inproceedings{Sun2020CoLAKECL,
  title={CoLAKE: Contextualized Language and Knowledge Embedding},
  author={Tianxiang Sun and Yunfan Shao and Xipeng Qiu and Qipeng Guo and Yaru Hu and Xuanjing Huang and Zheng Zhang},
  booktitle={COLING},
  year={2020}
}
With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models. Few works explore the potential of deep contextualized knowledge representation when injecting knowledge. In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation…
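
The joint language-and-knowledge idea can be pictured as flattening a small word-knowledge graph into a single model input, where knowledge nodes borrow positions from the mention they attach to and an attention mask keeps only graph edges visible. The sketch below is a minimal, hypothetical illustration of that construction; the example triple, the anchoring rule, and the masking scheme are assumptions for illustration, not the paper's exact recipe:

# Minimal sketch: flatten a tiny word-knowledge graph into one model input.
# Word nodes keep their sentence positions; entity/relation nodes reuse the
# position of the mention they attach to (a "soft position"), and a boolean
# mask restricts attention to edges that exist in the graph.
import numpy as np

sentence = ["Harry", "Potter", "is", "written", "by", "J.K.", "Rowling"]
# One linked triple from a knowledge graph (illustrative, not from the paper):
# (Harry_Potter, author, J._K._Rowling), anchored at the mention "Harry Potter".
kg_nodes = ["Harry_Potter", "author", "J._K._Rowling"]

tokens = sentence + kg_nodes
word_positions = list(range(len(sentence)))
# Soft positions: the entity node sits at the anchor position, and the
# relation/tail nodes continue counting from there.
anchor = 0
kg_positions = [anchor, anchor + 1, anchor + 2]
positions = word_positions + kg_positions

n = len(tokens)
visible = np.zeros((n, n), dtype=bool)
# Words attend to each other as in a normal sentence.
visible[: len(sentence), : len(sentence)] = True
# Graph edges: mention <-> entity, entity <-> relation, relation <-> tail.
edges = [(0, 7), (1, 7), (7, 8), (8, 9)]
for i, j in edges:
    visible[i, j] = visible[j, i] = True
np.fill_diagonal(visible, True)

print(tokens, positions, visible.sum())

In this toy layout the word nodes see each other as usual, while the entity, relation, and tail nodes are reachable only through the graph edges that link them to the mention.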

Citations

Enhancing Language Models with Plug-and-Play Large-Scale Commonsense
  • Wanyun Cui, Xingran Chen
  • Computer Science
  • ArXiv
  • 2021
TLDR
A plug-and-play method for large-scale commonsense integration without further pre-training is proposed, inspired by the observation that when fine-tuning LMs for downstream tasks without external knowledge, the variation in the parameter space was minor.
KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs
  • Yinquan Lu, H. Lu, Guirong Fu, Qun Liu
  • Computer Science
  • ArXiv
  • 2021
TLDR
A novel knowledge-aware language model framework based on the fine-tuning process, which equips the PLM with a unified knowledge-enhanced text graph containing both text and multi-relational sub-graphs extracted from a KG, and designs a hierarchical relational-graph-based message passing mechanism.
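
As a rough picture of relational message passing on such a graph, the toy step below applies one generic R-GCN-style update, where each node sums relation-specific transforms of its neighbours; this is a textbook formulation shown only to make the idea concrete, not KELM's hierarchical mechanism:

# One relational message-passing step over a toy multi-relational graph.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim, num_rels = 4, 8, 2
h = rng.normal(size=(num_nodes, dim))            # node states
W = rng.normal(size=(num_rels, dim, dim)) * 0.1  # one weight matrix per relation
W_self = np.eye(dim)                             # self-loop transform

# (source, relation, target) edges of the toy graph
edges = [(0, 0, 1), (1, 1, 2), (2, 0, 3), (3, 1, 0)]

msg = h @ W_self                                 # start from the self-loop contribution
for s, r, t in edges:
    msg[t] += h[s] @ W[r]                        # add a relation-specific neighbour message

h_next = np.maximum(msg, 0.0)                    # ReLU
print(h_next.shape)                              # (4, 8)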
SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining
TLDR
In SMedBERT, a medical PLM trained on large-scale medical corpora and incorporating deep structured semantic knowledge from the neighbours of linked entities, a mention-neighbour hybrid attention is proposed to learn heterogeneous entity information, which infuses the semantic representations of entity types into the homogeneous neighbouring entity structure.
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
TLDR
A novel contrastive learning framework named ERICA is proposed for the pre-training phase to obtain a deeper understanding of entities and their relations in text; it achieves consistent improvements on several document-level language understanding tasks, including relation extraction and reading comprehension, especially under low-resource settings.
K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering
TLDR
K-AID is a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing model size and deploying K-PLMs on resource-restricted devices for real-world applications.
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
TLDR
A unified framework named ERNIE 3.0 is proposed for pre-training large-scale knowledge-enhanced models; it fuses an auto-regressive network with an auto-encoding network, so that the trained model can be easily tailored for both natural language understanding and generation tasks with zero-shot learning, few-shot learning, or fine-tuning.
A Survey of Knowledge Enhanced Pre-trained Models
  • Jian Yang, Gang Xiao, +4 authors Jinghui Peng
  • Computer Science
  • ArXiv
  • 2021
TLDR
This survey provides a comprehensive overview of pre-trained models with knowledge injection (KEPTMs), which possess deeper understanding and logical reasoning and introduce interpretability to some extent, and outlines some potential directions of KEPTMs for future research.
OAG-BERT: Pre-train Heterogeneous Entity-augmented Academic Language Models
TLDR
To better endow OAG-BERT with the ability to capture entity information, novel pre-training strategies are developed, including heterogeneous entity type embedding, entity-aware 2D positional encoding, and span-aware entity masking.
Drop Redundant, Shrink Irrelevant: Selective Knowledge Injection for Language Pretraining
  • Ningyu Zhang, Shumin Deng, +4 authors Huajun Chen
  • Computer Science
  • IJCAI
  • 2021
TLDR
This study investigates the fundamental reasons for ineffective knowledge infusion and presents selective injection for language pretraining, which constitutes a model-agnostic method, is readily pluggable into previous approaches, and can enhance state-of-the-art knowledge injection methods.
NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task-Next Sentence Prediction
  • Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
  • Computer Science
  • ArXiv
  • 2021
TLDR
This paper attempts to accomplish several NLP tasks in the zero-shot scenario using an original BERT pre-training task abandoned by RoBERTa and other models: Next Sentence Prediction (NSP).
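
The NSP-based zero-shot recipe can be sketched with a stock pretrained BERT: turn each candidate label into a prompt sentence and keep the label whose prompt the NSP head scores as the most plausible continuation. The prompt wording and label set below are illustrative choices, not taken from the paper:

# Hedged sketch of prompt-based zero-shot classification via BERT's NSP head.
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

text = "The striker scored twice in the final minutes."
labels = ["sports", "politics", "technology"]
prompts = [f"This text is about {label}." for label in labels]

scores = []
with torch.no_grad():
    for prompt in prompts:
        enc = tokenizer(text, prompt, return_tensors="pt")
        logits = model(**enc).logits          # shape (1, 2); index 0 means "B follows A"
        scores.append(logits[0, 0].item())

print(labels[scores.index(max(scores))])      # label with the most plausible prompt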

References

SHOWING 1-10 OF 49 REFERENCES
K-BERT: Enabling Language Representation with Knowledge Graph
TLDR
This work proposes a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into sentences as domain knowledge; it significantly outperforms BERT and demonstrates that K-BERT is an excellent choice for solving knowledge-driven problems that require experts.
ERNIE: Enhanced Language Representation with Informative Entities
TLDR
This paper utilizes both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE) which can take full advantage of lexical, syntactic, and knowledge information simultaneously, and is comparable with the state-of-the-art model BERT on other common NLP tasks.
Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling
TLDR
This work introduces the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context, enabling the model to render information it has never seen before as well as generate out-of-vocabulary tokens.
Knowledge Enhanced Contextual Word Representations
TLDR
After integrating WordNet and a subset of Wikipedia into BERT, the knowledge-enhanced BERT (KnowBert) demonstrates improved perplexity, the ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation.
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
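
The probing setup boils down to cloze-style queries against a frozen masked LM. A minimal sketch with an off-the-shelf fill-mask pipeline follows; the example query is illustrative, and the actual probe sets in the paper are larger and more systematic:

# Hedged sketch of cloze-style factual probing with a pretrained masked LM.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("Dante was born in [MASK]."):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")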
CoKE: Contextualized Knowledge Graph Embedding
TLDR
Contextualized Knowledge Graph Embedding (CoKE) is presented, a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.
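
The core move in contextualized KG embedding is to treat a triple (or relation path) as a short input sequence and let a Transformer produce entity and relation representations that depend on that context. The sketch below is a minimal, assumption-laden version: vocabulary sizes, dimensions, and the tail-prediction head are illustrative choices rather than CoKE's exact architecture.

# Minimal sketch of the "triple as a sequence" idea: embed (head, relation, [MASK]),
# run a small Transformer encoder, and predict the tail entity from the masked slot.
import torch
import torch.nn as nn

num_entities, num_relations, dim = 1000, 50, 64
vocab = num_entities + num_relations + 1          # entities + relations + [MASK]
MASK_ID = vocab - 1

embed = nn.Embedding(vocab, dim)
pos = nn.Parameter(torch.zeros(3, dim))           # positions: head, relation, tail
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
predict_entity = nn.Linear(dim, num_entities)

# Query: (head=17, relation=num_entities + 3, tail=?)
seq = torch.tensor([[17, num_entities + 3, MASK_ID]])
hidden = encoder(embed(seq) + pos)                # (1, 3, dim), contextualized per slot
tail_logits = predict_entity(hidden[:, 2])        # scores over all candidate tails
print(tail_logits.shape)                          # torch.Size([1, 1000])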
Representation Learning of Knowledge Graphs with Entity Descriptions
TLDR
Experimental results on real-world datasets show that the proposed novel representation learning method for knowledge graphs outperforms other baselines on the two tasks, especially under the zero-shot setting, which indicates that the method is capable of building representations for novel entities according to their descriptions.
Integrating Graph Contextualized Knowledge into Pre-trained Language Models
TLDR
Experimental results demonstrate that the proposed model achieves state-of-the-art performance on several medical NLP tasks, and the improvement over MedERNIE indicates that graph contextualized knowledge is beneficial.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
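
The "one additional output layer" recipe is what standard fine-tuning looks like in practice. A minimal sketch using the widely used Hugging Face interface follows; the model name, label count, and toy batch are illustrative choices:

# Hedged sketch: load pretrained BERT, attach a fresh classification head, fine-tune end to end.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

batch = tokenizer(["a great movie", "a dull movie"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])
outputs = model(**batch, labels=labels)   # loss comes from the newly added head
outputs.loss.backward()                   # gradients flow through all layers jointly
print(float(outputs.loss))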
Embedding Entities and Relations for Learning and Inference in Knowledge Bases
TLDR
It is found that embeddings learned from the bilinear objective are particularly good at capturing relational semantics and that the composition of relations is characterized by matrix multiplication.
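
Stated compactly, the bilinear scoring function and its diagonal (DistMult-style) special case look as follows; the notation is the standard textbook form rather than the paper's exact symbols:

% Bilinear scoring of a triple (h, r, t): entities as vectors, relations as matrices.
% DistMult restricts M_r to a diagonal matrix; in the general bilinear case,
% composing two relations corresponds (approximately) to a matrix product.
\[
  f(h, r, t) = \mathbf{e}_h^{\top} \mathbf{M}_r \, \mathbf{e}_t,
  \qquad
  \text{DistMult: } \mathbf{M}_r = \operatorname{diag}(\mathbf{r}),
\]
\[
  \text{composition: } \mathbf{M}_{r_1 \circ r_2} \approx \mathbf{M}_{r_1} \mathbf{M}_{r_2}.
\]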