Thinking Like a Skeptic: Defeasible Inference in Natural Language

@inproceedings{Rudinger2020ThinkingLA,
  title={Thinking Like a Skeptic: Defeasible Inference in Natural Language},
  author={Rachel Rudinger and Vered Shwartz and Jena D. Hwang and Chandra Bhagavatula and Maxwell Forbes and Ronan Le Bras and Noah A. Smith and Yejin Choi},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2020},
  year={2020}
}
Defeasible inference is a mode of reasoning in which an inference (X is a bird, therefore X flies) may be weakened or overturned in light of new evidence (X is a penguin). Though long recognized in classical AI and philosophy, defeasible inference has not been extensively studied in the context of contemporary data-driven research on natural language inference and commonsense reasoning. We introduce Defeasible NLI (abbreviated δ-NLI), a dataset for defeasible inference in natural language…
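
To make the task format concrete, here is a minimal sketch in Python. The class name DefeasibleInstance and the field and label names (premise, hypothesis, update, strengthener/weakener) are illustrative assumptions for exposition, not the released δ-NLI schema; they simply mirror the structure the abstract describes, in which a new piece of evidence either strengthens or weakens a plausible inference.

from dataclasses import dataclass
from typing import Literal

# Illustrative sketch of a defeasible-inference instance. The schema below
# is an assumption for exposition, not the official delta-NLI format.
Label = Literal["strengthener", "weakener"]

@dataclass
class DefeasibleInstance:
    premise: str     # context P
    hypothesis: str  # inference H that P makes plausible
    update: str      # new evidence U
    label: Label     # whether U strengthens or weakens the inference P -> H

examples = [
    DefeasibleInstance(
        premise="X is a bird.",
        hypothesis="X flies.",
        update="X is a penguin.",
        label="weakener",  # the classic exception from the abstract
    ),
    DefeasibleInstance(
        premise="X is a bird.",
        hypothesis="X flies.",
        update="X migrates south every winter.",  # hypothetical strengthener
        label="strengthener",
    ),
]

for ex in examples:
    print(f"{ex.premise} -> {ex.hypothesis} | update: {ex.update} => {ex.label}")

A classifier for this task would then map the triple (premise, hypothesis, update) to the strengthener/weakener label, rather than judging the premise-hypothesis pair in isolation as in standard NLI.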

Citations

Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision

TLDR
This paper investigates multiple ways to automatically generate rationales using pre-trained language models, neural knowledge models, and distant supervision from related tasks, and trains generative models capable of composing explanatory rationales for unseen instances.

Could you give me a hint ? Generating inference graphs for defeasible reasoning

TLDR
This paper automatically generates meaningful graphs for the defeasible inference task through transfer learning from a related NLP task that shares the kind of reasoning that inference graphs support.

Think about it! Improving defeasible reasoning by first modeling the question scenario.

TLDR
The CURIOUS system achieves a new state-of-the-art on three different defeasible reasoning datasets, illustrating that performance can be improved by guiding a system to “think about” a question and explicitly model the scenario, rather than answering reflexively.

AnaLog: Testing Analytical and Deductive Logic Learnability in Language Models

TLDR
The best-performing language model is closely analysed, and it is shown that while it performs more consistently than other language models across logical connectives and reasoning domains, it is still sensitive to lexical and syntactic variations in the realisation of logical statements.

Epistemic closure filters for natural language inference

Epistemic closure refers to the assumption that humans are able to recognize what entails or contradicts what they believe and know, or more accurately, that humans’ epistemic states are closed under…

The Curious Case of Commonsense Intelligence

Commonsense intelligence is a long-standing puzzle in AI. Despite considerable advances in deep learning, AI continues to be narrow and brittle due to its lack of common sense. Why is common…

Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions

TLDR
A novel framework to generate pragmatically relevant true and false instances of a generic, which outperforms few-shot generation from GPT-3 and highlights the importance of constrained decoding for this task and the implications of generic exemplars for language inference tasks.

“I’m Not Mad”: Commonsense Implications of Negation and Contradiction

TLDR
This paper introduces ANION, a new commonsense knowledge graph with 624K if-then rules focusing on negated and contradictory events, and presents joint generative and discriminative inference models for this new resource, providing novel empirical insights on how logical negations and commonsense contradictions reshape the commonsense implications of their original premises.

Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2

Thinking aloud is an effective meta-cognitive strategy human reasoners apply to solve difficult problems. We suggest improving the reasoning ability of pre-trained neural language models in a…

PaCo: Preconditions Attributed to Commonsense Knowledge

TLDR
A novel challenge of reasoning with circumstantial preconditions of commonsense statements expressed in natural language is proposed; results reveal a 10-30% gap between machine and human performance on these tasks, showing that reasoning with preconditions is an open challenge.

References

SHOWING 1-10 OF 54 REFERENCES

Natural language inference

TLDR
This dissertation explores a range of approaches to NLI, beginning with methods that are robust but approximate and proceeding to progressively more precise approaches, and it greatly extends past work in natural logic to incorporate both semantic exclusion and implicativity.

Abductive Commonsense Reasoning

TLDR
This study introduces a challenge dataset, ART, that consists of over 20k commonsense narrative contexts and 200k explanations, and conceptualizes two new tasks: Abductive NLI, a multiple-choice question-answering task for choosing the more likely explanation, and Abductive NLG, a conditional generation task for explaining given observations in natural language.

A large annotated corpus for learning natural language inference

TLDR
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.

Annotation Artifacts in Natural Language Inference Data

TLDR
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.

ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

TLDR
Experimental results demonstrate that multitask models that incorporate the hierarchical structure of if-then relation types lead to more accurate inference compared to models trained in isolation, as measured by both automatic and human evaluation.

Ordinal Common-sense Inference

TLDR
This work describes a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task, and annotates subsets of previously established datasets via the ordinal annotation protocol in order to analyze the distinctions between those datasets and the newly constructed one.

Defeasible Reasoning

TLDR
A general theory of warrant, based on defeasible reasons, is developed and used as a guide in the construction of a theory of defeasible reasoning, and a computer program implementing that theory is developed.

Uncertain Natural Language Inference

TLDR
The feasibility of collecting annotations for UNLI is demonstrated by relabeling a portion of the SNLI dataset under a probabilistic scale, where even items with the same categorical label differ in how likely people judge them to be true given a premise.

Natural language inference.

The paper describes the way in which a Preference Semantics system for natural language analysis and generation tackles a difficult class of anaphoric inference problems (finding the correct referent…

Some Philosophical Problems from the Standpoint of Artificial Intelligence

...