Corpus ID: 219573621

Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

@article{Talmor2020TeachingPM,
  title={Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge},
  author={Alon Talmor and Oyvind Tafjord and Peter Clark and Yoav Goldberg and Jonathan Berant},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.06609}
}
To what extent can a neural network systematically reason over symbolic facts? Evidence suggests that large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. Recently, it has been shown that Transformer-based models succeed in consistent reasoning over explicit symbolic facts, under a "closed-world" assumption. However, in an open-domain setup, it is desirable to tap into the vast reservoir of implicit knowledge already encoded in the… 
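As a concrete illustration of the setup the abstract refers to, the sketch below verbalizes a few facts and rules, appends a candidate statement, and asks a Transformer classifier for a true/false judgment. The checkpoint name, label order, and example facts are illustrative assumptions of this sketch, not the authors' released model or data.

# Minimal sketch: reasoning over explicitly stated facts and rules,
# posed as a text-pair classification problem for a Transformer.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

facts = ["A whale is a mammal.", "A mammal is an animal."]
rules = ["If something is a mammal then it is warm-blooded."]
question = "A whale is warm-blooded."

context = " ".join(facts + rules)
model_name = "roberta-base"  # placeholder; a model fine-tuned on such data is assumed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

inputs = tokenizer(context, question, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
# Index 1 is assumed to be the "true" label for this untrained head.
print("P(true) =", torch.softmax(logits, dim=-1)[0, 1].item())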

Citations

Knowledge-driven Self-supervision for Zero-shot Commonsense Question Answering
TLDR
A novel neuro-symbolic framework for zero-shot question answering across commonsense tasks is proposed and it is shown that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
Flexible Operations for Natural Language Deduction
TLDR
This paper uses a BART-based model to generate the result of applying a particular logical operation to one or more premise statements, and has a largely automated pipeline for scraping and constructing suitable training examples from Wikipedia, which are then paraphrased to give the models the ability to handle lexical variation.
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
TLDR
This paper provides a formal framework for characterizing approaches to learning from explanation data, proposes a synthetic task for studying how models learn from explanation data, and gives graphical models for the available modeling approaches.
Improving Neural Model Performance through Natural Language Feedback on Their Explanations
TLDR
This work introduces MERCURIE, an interactive system that refines its explanations for a given reasoning task based on human feedback given in natural language, and generates graphs with 40% fewer inconsistencies than the off-the-shelf system.
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Thinking aloud is an effective meta-cognitive strategy human reasoners apply to solve difficult problems. This work suggests improving the reasoning ability of pre-trained neural language models in a similar way: by having the model dynamically generate elaborations of the problem that expand its context before answering.
Concepts, Properties and an Approach for Compositional Generalization
TLDR
This work connects a series of studies on compositional generalization and summarizes an approach that uses architecture design and regularization to regulate the information carried by representations, with the aim of advancing artificial intelligence.
Inducing Taxonomic Knowledge from Pretrained Transformers
TLDR
A method for inducing taxonomic trees from pretrained transformers by assigning a score for the likelihood that each pair of terms forms a parent-child relation and producing the maximum spanning tree.
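The taxonomy-induction recipe summarized above (score every candidate parent-child pair, then keep the best spanning tree) can be sketched as follows. The toy scoring function and term list are stand-ins; the cited work derives scores from a pretrained transformer.

# Sketch: induce a taxonomy from pairwise parent-child scores.
import itertools
import networkx as nx

terms = ["animal", "mammal", "dog", "cat"]

def parent_child_score(parent: str, child: str) -> float:
    """Placeholder score for 'child is-a parent'; an LM-based scorer is assumed."""
    hand_coded = {("animal", "mammal"): 0.9, ("mammal", "dog"): 0.8,
                  ("mammal", "cat"): 0.7, ("animal", "dog"): 0.4}
    return hand_coded.get((parent, child), 0.1)

G = nx.DiGraph()
for parent, child in itertools.permutations(terms, 2):
    G.add_edge(parent, child, weight=parent_child_score(parent, child))

# Maximum spanning arborescence = directed analogue of a maximum spanning tree.
taxonomy = nx.maximum_spanning_arborescence(G, attr="weight")
print(sorted(taxonomy.edges()))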
Overcoming Poor Word Embeddings with Word Definitions
TLDR
This work shows that examples that depend critically on a rarer word are more challenging for natural language inference models, and explores how a model could learn to use definitions, provided in natural text, to overcome this handicap.
Representing Numbers in NLP: a Survey and a Vision
TLDR
This work synthesizes best practices for representing numbers in text and articulates a vision for holistic numeracy in NLP, comprising design trade-offs and a unified evaluation.
Shepherd Pre-trained Language Models to Develop a Train of Thought: An Iterative Prompting Approach
TLDR
An iterative prompting framework is proposed: a new prompting paradigm that progressively elicits relevant knowledge from PLMs for multi-step inference tasks, together with an iterative context-aware prompter that learns to dynamically synthesize prompts conditioned on the current step's context.
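A rough sketch of that iterative-prompting loop is given below, under the assumption of a generic GPT-2 generator and hand-written prompt templates; the prompter in the cited paper is learned, so this is only an illustration of the control flow.

# Sketch: at each step, prompt the LM for one piece of intermediate knowledge
# and append it to the context used by the next step.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
question = "Can a sunflower grow at the bottom of the ocean?"
context = ""

for step in range(3):
    prompt = f"Question: {question}\nKnown so far: {context}\nNext relevant fact:"
    out = generator(prompt, max_new_tokens=20, do_sample=False)[0]["generated_text"]
    fact = out[len(prompt):].strip().split("\n")[0]  # keep only the new continuation
    context += " " + fact

final = generator(f"Question: {question}\nFacts:{context}\nAnswer:",
                  max_new_tokens=5, do_sample=False)[0]["generated_text"]
print(final)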

References

SHOWING 1-10 OF 50 REFERENCES
Analysing Mathematical Reasoning Abilities of Neural Models
TLDR
This paper conducts a comprehensive analysis of models from two broad classes of the most powerful sequence-to-sequence architectures and finds notable differences in their ability to resolve mathematical problems and generalize their knowledge.
How Context Affects Language Models' Factual Predictions
TLDR
This paper reports that augmenting pre-trained language models' queries with relevant retrieved context dramatically improves their factual predictions, and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline.
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
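That probing setup can be reproduced in miniature with a fill-in-the-blank query against an off-the-shelf masked LM; the cloze statement below mirrors the kind of query used in such probes and is only an illustration.

# Sketch: query BERT's "knowledge" with a masked-token cloze statement.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("Dante was born in [MASK]."):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")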
Enhanced LSTM for Natural Language Inference
TLDR
This paper presents a new state-of-the-art result, achieving the accuracy of 88.6% on the Stanford Natural Language Inference Dataset, and demonstrates that carefully designing sequential inference models based on chain LSTMs can outperform all previous models.
Compositional Generalization for Primitive Substitutions
TLDR
This paper conducts fundamental research for encoding compositionality in neural networks with two representations, one generating attention maps, and the other mapping attended input words to output symbols to improve generalization.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
A large annotated corpus for learning natural language inference
TLDR
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Never-Ending Learning
TLDR
The Never-Ending Language Learner is described, which achieves some of the desired properties of a never-ending learner, and lessons learned are discussed.
Improving Language Understanding by Generative Pre-Training
TLDR
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.
Annotation Artifacts in Natural Language Inference Data
TLDR
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI examples, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.
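The hypothesis-only finding above can be spot-checked with a simple bag-of-words classifier trained on hypotheses alone. Loading SNLI via the Hugging Face datasets library, the subsample size, and the feature choices are assumptions of this sketch, not the paper's exact setup.

# Sketch: hypothesis-only baseline for SNLI.
from datasets import load_dataset
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

snli = load_dataset("snli")
train = snli["train"].filter(lambda x: x["label"] != -1).select(range(20000))
test = snli["validation"].filter(lambda x: x["label"] != -1)

vec = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
X_train = vec.fit_transform(train["hypothesis"])  # note: premises are never seen
X_test = vec.transform(test["hypothesis"])

clf = LogisticRegression(max_iter=1000).fit(X_train, train["label"])
print("hypothesis-only accuracy:", clf.score(X_test, test["label"]))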