Corpus ID: 238253193

A Survey of Knowledge Enhanced Pre-trained Models

@article{Yang2021ASO,
  title={A Survey of Knowledge Enhanced Pre-trained Models},
  author={Jian Yang and Gang Xiao and Yulong Shen and Wei Jiang and Xinyu Hu and Ying Zhang and Jinghui Peng},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.00269}
}
Pre-trained models learn informative representations on large-scale training data through self-supervised or supervised learning, and have achieved promising performance in natural language processing (NLP), computer vision (CV), and cross-modal fields after fine-tuning. These models, however, suffer from poor robustness and lack of interpretability. Pre-trained models with knowledge injection, which we call knowledge enhanced pre-trained models (KEPTMs), possess deep understanding and…
Citations

Multimodal Classification of Safety-Report Observations
TLDR
A multimodal machine-learning architecture is developed for the analysis and categorization of safety observations, given textual descriptions and images taken from the location sites, based on the joint fine-tuning of large pre-trained language and image neural network models.
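A minimal sketch of the kind of late-fusion classifier such a setup can reduce to, assuming the language and image encoders have already produced fixed-size feature vectors; the dimensions, class count, and fusion head below are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenate text and image features, then classify (illustrative sketch)."""
    def __init__(self, text_dim=768, image_dim=512, num_classes=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, text_feat, image_feat):
        return self.head(torch.cat([text_feat, image_feat], dim=-1))

# Stand-in features; in practice these come from jointly fine-tuned pre-trained encoders.
logits = LateFusionClassifier()(torch.randn(2, 768), torch.randn(2, 512))
print(logits.shape)  # torch.Size([2, 5])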
Technical review on knowledge intensive NLP for pre-trained language development
TLDR
The present progress of pre-trained language model-based knowledge-enhanced models (PLMKEs) is described by deconstructing their three key elements: information sources, knowledge-intensive NLP tasks, and knowledge fusion methods.
Knowledge-Augmented Methods for Natural Language Processing
TLDR
This tutorial introduces the key steps in integrating knowledge into NLP, including knowledge grounding from text, knowledge representation and fusing, and introduces recent state-of-the-art applications in fusing knowledge into language understanding, language generation and commonsense reasoning.
KESA: A Knowledge Enhanced Approach For Sentiment Analysis
TLDR
This paper studies sentence-level sentiment analysis and proposes two sentiment-aware auxiliary tasks, sentiment word cloze and conditional sentiment prediction, arguing that this additional sentiment information helps models learn deeper semantic representations.
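A toy illustration of what a sentiment word cloze task can look like (the lexicon, blanking, and candidate construction here are invented for this sketch rather than taken from KESA): one sentiment word is blanked out and the model must pick it from a small candidate set as an auxiliary objective alongside the main sentiment classifier.

import random

SENTIMENT_WORDS = ["great", "terrible", "boring", "wonderful"]  # toy lexicon

def make_cloze_example(tokens, seed=0):
    """Blank one sentiment word; return (cloze_tokens, candidates, answer_index)."""
    rng = random.Random(seed)
    positions = [i for i, t in enumerate(tokens) if t.lower() in SENTIMENT_WORDS]
    if not positions:
        return None                                    # nothing to blank
    i = rng.choice(positions)
    answer = tokens[i]
    cloze = tokens[:i] + ["[BLANK]"] + tokens[i + 1:]
    candidates = rng.sample([w for w in SENTIMENT_WORDS if w != answer.lower()], 2) + [answer]
    rng.shuffle(candidates)
    return cloze, candidates, candidates.index(answer)

print(make_cloze_example("the plot was boring but the acting was great".split()))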
A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models
TLDR
This paper aims to summarize the current progress of pre-trained language modelbased knowledge-enhanced models (PLMKEs) by dissecting their three vital elements: knowledge sources, knowledge-intensive NLP tasks, and knowledge fusion methods.
What Can Knowledge Bring to Machine Learning?—A Survey of Low-shot Learning for Structured Data
TLDR
The fundamental factors of low-shot learning technologies are reviewed, with a focus on the operation of structured knowledge under different low-shot conditions, and the prospects and gaps of industrial applications and future research directions are pointed out.

References

Showing 1-10 of 195 references
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
TLDR
Sentiment Knowledge Enhanced Pre-training (SKEP) is introduced to learn a unified sentiment representation for multiple sentiment analysis tasks; it significantly outperforms strong pre-training baselines and achieves new state-of-the-art results on most of the test datasets.
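A minimal sketch of sentiment-aware masking in this spirit, using a hard-coded toy lexicon; SKEP itself mines sentiment knowledge automatically and also handles aspect-sentiment pairs, so this only conveys the flavour of the masking objective.

import random

TOY_LEXICON = {"love", "great", "awful", "terrible"}   # stand-in for mined sentiment words

def sentiment_masking(tokens, mask_token="[MASK]", base_rate=0.1, seed=0):
    """Always mask lexicon words; mask other tokens at a low base rate."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if tok.lower() in TOY_LEXICON or rng.random() < base_rate:
            masked.append(mask_token)
            targets.append(tok)        # prediction targets for the MLM-style loss
        else:
            masked.append(tok)
            targets.append(None)       # ignored by the loss
    return masked, targets

print(sentiment_masking("I love the screen but the battery is awful".split()))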
K-BERT: Enabling Language Representation with Knowledge Graph
TLDR
This work proposes a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into sentences as domain knowledge; K-BERT significantly outperforms BERT and shows promising results on twelve NLP tasks.
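A bare-bones sketch of the injection step with a toy knowledge graph; K-BERT additionally uses soft positions and a visible matrix so the injected branch does not disturb the original sentence, which this flattened version omits.

TOY_KG = {
    "Beijing": [("capital_of", "China")],
    "Apple": [("is_a", "Company")],
}

def inject_triples(tokens, kg=TOY_KG):
    """Append (relation, tail) pairs right after each matching entity mention."""
    enriched = []
    for tok in tokens:
        enriched.append(tok)
        for rel, tail in kg.get(tok, []):
            enriched.extend([rel, tail])   # branch of the sentence tree, flattened here
    return enriched

print(inject_triples("Tim Cook is visiting Beijing now".split()))
# ['Tim', 'Cook', 'is', 'visiting', 'Beijing', 'capital_of', 'China', 'now']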
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
TLDR
This investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs, and suggests that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.
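A small sketch of how such generative knowledge graph completion can be framed (the serialization below is illustrative, not COMET's exact input format): the model is trained to produce the tail phrase given a head event and a relation such as ATOMIC's xIntent.

def format_example(head, relation, tail=None):
    """Return an inference-time prompt, or a full training string when a tail is given."""
    prompt = f"{head} <{relation}>"
    return prompt if tail is None else f"{prompt} {tail}"

print(format_example("PersonX goes to the gym", "xIntent"))                  # prompt only
print(format_example("PersonX goes to the gym", "xIntent", "to get fit"))    # training pair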
PTR: Prompt Tuning with Rules for Text Classification
TLDR
This work proposes prompt tuning with rules (PTR) for many-class text classification, applying logic rules to construct prompts from several sub-prompts, which encodes prior knowledge about each class into prompt tuning.
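A toy sketch of composing sub-prompts under a logic rule; the rule, templates, and mask slots below are invented for illustration, and PTR's actual templates and label words differ.

def entity_subprompt(mention):
    return f"the [MASK] {mention}"      # [MASK] ranges over entity-type label words

def relation_subprompt():
    return "[MASK]"                      # [MASK] ranges over relation label words

def build_prompt(sentence, e1, e2):
    # Illustrative rule: person(e1) AND organization(e2) AND employed_by(e1, e2)
    return f"{sentence} {entity_subprompt(e1)} {relation_subprompt()} {entity_subprompt(e2)}."

print(build_prompt("Jane joined Acme in 2020.", "Jane", "Acme"))
# Jane joined Acme in 2020. the [MASK] Jane [MASK] the [MASK] Acme.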
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
TLDR
Experimental results demonstrate that ERICA can improve typical PLMs (BERT and RoBERTa) on several language understanding tasks, including relation extraction, entity typing and question answering, especially under low-resource settings.
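A compact numpy sketch of an InfoNCE-style contrastive loss over entity embeddings, the generic mechanism behind this kind of pre-training; the similarity function, temperature, and the way positives and negatives are chosen are illustrative assumptions rather than ERICA's exact document-level setup.

import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Pull the anchor toward the positive and away from the negatives."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits /= temperature
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]                     # the positive sits at index 0

rng = np.random.default_rng(0)
anchor, positive = rng.standard_normal(8), rng.standard_normal(8)
negatives = rng.standard_normal((4, 8))
print(float(info_nce(anchor, positive, negatives)))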
SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge
TLDR
A novel language representation model called SentiLARE is proposed, which introduces word-level linguistic knowledge including part-of-speech tag and sentiment polarity (inferred from SentiWordNet) into pre-trained models.
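A toy illustration of the word-level annotation step, with a hard-coded tagger and polarity lexicon standing in for a real POS tagger and SentiWordNet.

TOY_POS = {"the": "DET", "movie": "NOUN", "was": "VERB", "boring": "ADJ"}
TOY_POLARITY = {"boring": -1, "great": 1}      # -1 negative, 0 neutral, +1 positive

def annotate(tokens):
    """Pair each token with a POS tag and a sentiment polarity label."""
    return [(t, TOY_POS.get(t, "OTHER"), TOY_POLARITY.get(t, 0)) for t in tokens]

print(annotate("the movie was boring".split()))
# [('the', 'DET', 0), ('movie', 'NOUN', 0), ('was', 'VERB', 0), ('boring', 'ADJ', -1)]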
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
TLDR
This work proposes new pretrained contextualized representations of words and entities based on the bidirectional transformer, together with an entity-aware self-attention mechanism that considers the types of tokens (words or entities) when computing attention scores.
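A toy numpy sketch of the core idea, a query projection that depends on whether the attending and attended positions are words or entities; the single head, sizes, and random weights are illustrative, not LUKE's configuration.

import numpy as np

d = 4
rng = np.random.default_rng(0)
W_q = {pair: rng.standard_normal((d, d)) for pair in
       [("word", "word"), ("word", "entity"), ("entity", "word"), ("entity", "entity")]}
W_k, W_v = rng.standard_normal((d, d)), rng.standard_normal((d, d))

def entity_aware_attention(x, types):
    """x: (n, d) token states; types: a 'word'/'entity' label per token."""
    n = len(types)
    keys, values = x @ W_k, x @ W_v
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            q_ij = x[i] @ W_q[(types[i], types[j])]    # type-dependent query
            scores[i, j] = q_ij @ keys[j] / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

x = rng.standard_normal((3, d))
print(entity_aware_attention(x, ["word", "word", "entity"]).shape)   # (3, 4)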
CoLAKE: Contextualized Language and Knowledge Embedding
TLDR
The Contextualized Language and Knowledge Embedding (CoLAKE) is proposed, which jointly learns contextualized representations for both language and knowledge with an extended MLM objective; it achieves surprisingly high performance on a synthetic word-knowledge graph completion task, showing the benefit of simultaneously contextualizing language and knowledge representations.
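A minimal sketch of assembling a unified word-knowledge graph: sentence tokens form a chain, a mention is linked to its entity, and the entity's KG neighbours are attached. The toy data and the flat node handling are simplifications; CoLAKE's actual graph construction, node types, and masking are richer.

TOY_KG = {"Q90": [("capital_of", "Q142")]}     # e.g. Paris -capital_of-> France
MENTION_TO_ENTITY = {"Paris": "Q90"}

def build_word_knowledge_graph(tokens):
    """Return (nodes, edges): a word chain plus linked entity/relation/tail nodes."""
    nodes = list(tokens)
    edges = [(i, i + 1) for i in range(len(tokens) - 1)]   # adjacent words
    for i, tok in enumerate(tokens):
        ent = MENTION_TO_ENTITY.get(tok)
        if ent is None:
            continue
        nodes.append(ent)
        ent_idx = len(nodes) - 1
        edges.append((i, ent_idx))                          # mention -> entity
        for rel, tail in TOY_KG.get(ent, []):
            nodes.extend([rel, tail])
            rel_idx, tail_idx = len(nodes) - 2, len(nodes) - 1
            edges.extend([(ent_idx, rel_idx), (rel_idx, tail_idx)])
    return nodes, edges

print(build_word_knowledge_graph("She moved to Paris in 2019".split()))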
Do Syntax Trees Help Pre-trained Transformers Extract Information?
TLDR
This work systematically studies the utility of incorporating dependency trees into pre-trained transformers on three representative information extraction tasks, and proposes and investigates two distinct strategies for incorporating dependency structure: a late fusion approach, which applies a graph neural network on the output of a transformer, and a joint fusion approach, which infuses syntax structure into the transformer attention layers.
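A toy numpy sketch of the late-fusion idea only: one mean-aggregation message-passing step over dependency edges, applied on top of contextual token vectors. Random vectors and a hand-written edge list stand in for a real transformer output and parse, and the paper's graph neural network is learned rather than fixed; joint fusion is not shown.

import numpy as np

def late_fusion_step(hidden, edges):
    """hidden: (n, d) transformer outputs; edges: (head, dependent) index pairs."""
    n, _ = hidden.shape
    adj = np.eye(n)                          # self-loops keep each token's own state
    for h, m in edges:
        adj[h, m] = adj[m, h] = 1.0          # undirected dependency edge
    adj /= adj.sum(axis=1, keepdims=True)    # row-normalise: average over neighbours
    return adj @ hidden                      # one graph-convolution-style update

hidden = np.random.default_rng(0).standard_normal((4, 8))
print(late_fusion_step(hidden, [(1, 0), (1, 2), (1, 3)]).shape)   # (4, 8)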
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
TLDR
A general-purpose fine-tuning recipe is presented for retrieval-augmented generation (RAG), models which combine pre-trained parametric and non-parametric memory for language generation; RAG models are found to generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
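A minimal retrieve-then-generate pipeline sketch under strong simplifications: word-overlap scoring stands in for RAG's dense retriever and a string template stands in for its pre-trained seq2seq generator, so only the data flow corresponds to the method.

TOY_INDEX = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
]

def retrieve(query, index=TOY_INDEX, k=1):
    """Rank passages by word overlap with the query (stand-in for dense retrieval)."""
    q = set(query.lower().split())
    return sorted(index, key=lambda p: len(q & set(p.lower().split())), reverse=True)[:k]

def generate(query, passages):
    """Stand-in generator: a real system conditions a seq2seq model on this input."""
    return f"question: {query} context: {' '.join(passages)}"

query = "Where is the Eiffel Tower?"
print(generate(query, retrieve(query)))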
...