Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

@inproceedings{LoganIV2019BaracksWH,
  title={Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling},
  author={Robert L. Logan IV and Nelson F. Liu and Matthew E. Peters and Matt Gardner and Sameer Singh},
  booktitle={ACL},
  year={2019}
}
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. [...] These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens. We also introduce the Linked WikiText-2 dataset, a corpus of annotated text aligned to the Wikidata knowledge graph whose contents (roughly) match the popular WikiText-2 benchmark. In experiments, we demonstrate that the KGLM achieves significantly better…
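As a rough, illustrative Python sketch (not the authors' KGLM implementation), the snippet below shows the generate-or-copy idea such fact-aware models rely on: at each step a gate decides whether to emit a token from the ordinary vocabulary softmax or to copy an entity alias out of a small local knowledge graph, which is how out-of-vocabulary facts can be rendered. All names here (the toy KG dictionary, step, the logits) are hypothetical.

import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Toy "knowledge graph": entity -> {relation: object alias}.
KG = {"Barack Obama": {"spouse": "Michelle Obama", "born_in": "Honolulu"}}

def step(vocab_logits, copy_logit, entity, relation_logits, relations, vocab, rng):
    """Either copy a knowledge-graph alias or generate a vocabulary token."""
    p_copy = 1.0 / (1.0 + np.exp(-copy_logit))       # gate: copy vs. generate
    if entity in KG and rng.random() < p_copy:
        rel = relations[int(np.argmax(relation_logits))]
        return KG[entity].get(rel, "<unk>")          # may be out-of-vocabulary
    return vocab[int(np.argmax(softmax(vocab_logits)))]

rng = np.random.default_rng(0)
vocab = ["the", "wife", "of", "is", "<unk>"]
relations = ["spouse", "born_in"]
print(step(np.array([0.1, 2.0, 0.3, 0.2, -1.0]), 3.0, "Barack Obama",
           np.array([2.5, 0.1]), relations, vocab, rng))   # -> "Michelle Obama"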
CoLAKE: Contextualized Language and Knowledge Embedding
TLDR: The Contextualized Language and Knowledge Embedding (CoLAKE) is proposed, which jointly learns contextualized representations for both language and knowledge with an extended MLM objective, and achieves surprisingly high performance on a synthetic task called word-knowledge graph completion, which shows the benefit of simultaneously contextualizing language and knowledge representations.
Pretrain Knowledge-Aware Language Models
How much knowledge do pretrained language models hold? Recent research observed that pretrained transformers are adept at modeling semantics, but it is unclear to what degree they grasp human knowledge…
Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge
TLDR: This work develops a neural language model that includes an explicit interface between symbolically interpretable factual information and subsymbolic neural knowledge, and shows that this model dramatically improves performance on two knowledge-intensive question-answering tasks.
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
TLDR: This work proposes a simple yet effective weakly supervised pretraining objective, which explicitly forces the model to incorporate knowledge about real-world entities and consistently outperforms BERT on four entity-related question answering datasets.
Controllable Story Generation with External Knowledge Using Large-Scale Language Models
TLDR: MEGATRON-CNTRL is a novel framework that uses large-scale language models and adds control to text generation by incorporating an external knowledge base, and showcases the controllability of the model by replacing the keywords used to generate stories and re-running the generation process.
Knowledge-Enhanced Natural Language Inference Based on Knowledge Graphs
TLDR: A novel Knowledge Graph-enhanced NLI (KGNLI) model is proposed to leverage background knowledge stored in knowledge graphs for NLI, and experiments on four benchmarks demonstrate the effectiveness of the model.
Inducing Relational Knowledge from BERT
TLDR: This work proposes a methodology for distilling relational knowledge from a pre-trained language model: the model is fine-tuned to predict whether a given word pair is likely to be an instance of some relation, given an instantiated template for that relation as input.
Generating Factual Documents by Synthesizing Knowledge Sources (2020)
From youth, humans can read and process large amounts of information to write articles and book reports and to conduct deep conversations. Existing large-scale language models are yet incapable of such…
Mining Knowledge for Natural Language Inference from Wikipedia Categories
TLDR: WikiNLI is introduced: a resource for improving model performance on NLI and LE tasks; it is shown that strong baselines such as BERT and RoBERTa can be improved by pretraining them on WikiNLI and then transferring the models to downstream tasks.

References

Showing 1–10 of 30 references
A Neural Knowledge Language Model
TLDR: A Neural Knowledge Language Model (NKLM) is proposed, which combines symbolic knowledge provided by a knowledge graph with an RNN language model; the NKLM significantly improves perplexity while generating a much smaller number of unknown words.
Do Language Models Have Common Sense?
It has been argued that current machine learning models do not have common sense, and therefore must be hard-coded with prior knowledge (Marcus, 2018). Here we show surprising evidence that language…
Language Models are Unsupervised Multitask Learners
TLDR: It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Reference-Aware Language Models
TLDR: Experiments on three representative applications show that the coreference model variants outperform models based on deterministic attention and standard language modeling baselines.
A Neural Conversational Model
TLDR: A simple approach to conversational modeling is presented which uses the recently proposed sequence-to-sequence framework and is able to extract knowledge both from a domain-specific dataset and from a large, noisy, general-domain dataset of movie subtitles.
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
TLDR: It is shown that the expressiveness of softmax-based models (including the majority of neural language models) is limited by a softmax bottleneck, and a simple and effective method is proposed to address this issue.
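To make the softmax-bottleneck remedy concrete, here is a minimal numpy sketch of a mixture-of-softmaxes (MoS) output layer of the kind that paper proposes; the shapes and weight names (d, K, V, W_pi, W_h, W_out) are illustrative assumptions, not the paper's code.

import numpy as np

def softmax(x, axis=-1):
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

d, K, V = 8, 3, 20                      # hidden size, mixture components, vocabulary size
rng = np.random.default_rng(0)
h = rng.normal(size=d)                  # context vector from the RNN
W_pi = rng.normal(size=(K, d))          # mixture-weight projection
W_h = rng.normal(size=(K, d, d))        # per-component context projections
W_out = rng.normal(size=(V, d))         # shared output embedding

pi = softmax(W_pi @ h)                               # mixture weights pi_k(c)
h_k = np.tanh(np.einsum("kij,j->ki", W_h, h))        # component contexts h_k(c)
p = pi @ softmax(h_k @ W_out.T, axis=-1)             # p(x|c) = sum_k pi_k softmax(h_k W^T)
assert np.isclose(p.sum(), 1.0)

Because the final distribution is a weighted sum of K softmaxes rather than a single one, its log-probability matrix is no longer constrained to low rank, which is the point the paper makes.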
Pointer Sentinel Mixture Models
TLDR: The pointer sentinel-LSTM model achieves state-of-the-art language modeling performance on the Penn Treebank while using far fewer parameters than a standard softmax LSTM, and the freely available WikiText corpus is introduced.
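The pointer-sentinel mixture can be written down in a few lines; the sketch below is an illustrative numpy version (variable names are assumptions, not the paper's code) in which the sentinel's share of the attention acts as the gate between the vocabulary softmax and a pointer distribution over tokens already seen in the context.

import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

vocab = ["the", "cat", "sat", "on", "mat", "<unk>"]
context = ["the", "cat", "sat", "on", "the"]          # previously observed tokens

rng = np.random.default_rng(1)
attn = softmax(rng.normal(size=len(context) + 1))     # last slot is the sentinel
g = attn[-1]                                          # gate = sentinel attention mass

p_vocab = softmax(rng.normal(size=len(vocab)))        # ordinary softmax over the vocabulary
p_ptr = np.zeros(len(vocab))
for w, a in zip(context, attn[:-1]):                  # scatter pointer mass onto word types
    p_ptr[vocab.index(w)] += a
p_ptr /= p_ptr.sum()                                  # renormalize the pointer component

p = g * p_vocab + (1.0 - g) * p_ptr                   # final mixture distribution
assert np.isclose(p.sum(), 1.0)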
Regularizing and Optimizing LSTM Language Models
TLDR: This paper proposes the weight-dropped LSTM, which uses DropConnect on hidden-to-hidden weights as a form of recurrent regularization, and introduces NT-ASGD, a variant of the averaged stochastic gradient method in which the averaging trigger is determined by a non-monotonic condition rather than being tuned by the user.
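The non-monotonic averaging trigger mentioned in that summary is simple enough to sketch; the helper below is a hypothetical illustration (not the released training loop): averaging starts once the newest validation loss is no better than the best loss observed more than n evaluations ago.

def should_trigger_averaging(val_losses, n=5):
    """Return True when validation loss has not improved over the last n checks."""
    if len(val_losses) <= n:
        return False
    return val_losses[-1] > min(val_losses[:-n])

history = [5.0, 4.6, 4.4, 4.35, 4.36, 4.37, 4.38, 4.39, 4.40]
print(should_trigger_averaging(history))   # True: no recent improvement, switch to averaging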
Entity Linking via Joint Encoding of Types, Descriptions, and Context
TLDR: This work presents a neural, modular entity linking system that learns a unified dense representation for each entity using multiple sources of information, such as its description, contexts around its mentions, and its fine-grained types.
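As a toy illustration of scoring candidates against such a unified dense entity representation (with made-up vectors and names, not the authors' encoders), one can rank candidates by the dot product between a mention-context encoding and each entity's pooled feature vectors:

import numpy as np

rng = np.random.default_rng(0)
d = 16

def encode(parts):
    """Stand-in encoder: pool one feature vector per information source (description, types, ...)."""
    return np.mean(parts, axis=0)

candidates = {                                   # hypothetical candidate entities
    "Michelle_Obama": encode([rng.normal(size=d), rng.normal(size=d)]),
    "Hillary_Clinton": encode([rng.normal(size=d), rng.normal(size=d)]),
}
mention_context = rng.normal(size=d)             # encoding of the mention and its sentence

scores = {name: float(vec @ mention_context) for name, vec in candidates.items()}
print(max(scores, key=scores.get))               # highest-scoring candidate is linked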
Neural Text Generation from Structured Data with Application to the Biography Domain
TLDR: A neural model for concept-to-text generation is introduced that scales to large, rich domains and outperforms a classical Kneser-Ney language model adapted to this task by nearly 15 BLEU.