Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision

Faeze Brahman, Vered Shwartz, Rachel Rudinger, Yejin Choi
The black-box nature of neural models has motivated a line of research that aims to generate natural language rationales to explain why a model made certain predictions. Such rationale generation models, to date, have been trained on dataset-specific crowdsourced rationales, but this approach is costly and is not generalizable to new tasks and domains. In this paper, we investigate the extent to which neural models can reason about natural language rationales that explain model predictions… 


ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning

This work presents ExplaGraphs, a new generative and structured commonsense-reasoning task (and an associated dataset) of explanation graph generation for stance prediction, and proposes a multi-level evaluation framework that checks the structural and semantic correctness of the generated graphs and their degree of match with ground-truth graphs.

Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations

This work introduces RExC, a self-rationalizing framework that grounds its predictions and two complementary types of explanations (NLEs and extractive rationales) in background knowledge, and improves over previous methods by reaching SOTA task performance while also providing explanations.

Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations

This work develops Maieutic Prompting, which infers a correct answer to a question even from the noisy and inconsistent generations of an LM, and improves robustness in inference while providing interpretable rationales.

Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2

Thinking aloud is an effective meta-cognitive strategy human reasoners apply to solve difficult problems. We suggest improving the reasoning ability of pre-trained neural language models in a…

Reframing Human-AI Collaboration for Generating Free-Text Explanations

This work creates a pipeline that combines GPT-3 with a supervised filter that incorporates binary acceptability judgments from humans in the loop and demonstrates that acceptability is partially correlated with various fine-grained attributes of explanations.

Few-Shot Self-Rationalization with Natural Language Prompts

This work presents FEB, a standardized collection of four existing English-language datasets and associated metrics, identifies the right prompting approach by extensively exploring natural language prompts on FEB, and demonstrates that making progress on few-shot self-rationalization is possible.

Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing

This review identifies 61 datasets with three predominant classes of textual explanations (highlights, free-text, and structured), organizes the literature on annotating each type, identifies strengths and shortcomings of existing collection methodologies, and gives recommendations for collecting EXNLP datasets in the future.

On the Diversity and Limits of Human Explanations

Inspired by prior work in psychology and cognitive science, existing human explanations in NLP are grouped into three categories, proximal mechanism, evidence, and procedure, which differ in nature and have implications for the resultant explanations.

Teach Me to Explain: A Review of Datasets for Explainable NLP

This review identifies three predominant classes of explanations (highlights, free-text, and structured), organizes the literature on annotating each type, points to what has been learned to date, and gives recommendations for collecting EXNLP datasets in the future.

PInKS: Preconditioned Commonsense Inference with Minimal Supervision

PInKS is shown to improve results on benchmarks focused on reasoning with the preconditions of commonsense knowledge (by up to 40% Macro-F1), and is further analyzed through PAC-Bayesian informativeness analysis, precision measures, and an ablation study.

Explain Yourself! Leveraging Language Models for Commonsense Reasoning

This work collects human explanations for commonsense reasoning, in the form of natural language sequences and highlighted annotations, in a new dataset called Common Sense Explanations, and uses them to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation framework.

Learning to Faithfully Rationalize by Construction

Variations of this simple framework yield predictive performance superior to ‘end-to-end’ approaches, while being more general and easier to train.

e-SNLI: Natural Language Inference with Natural Language Explanations

The Stanford Natural Language Inference dataset is extended with an additional layer of human-annotated natural language explanations of the entailment relations, which can be used for various goals, such as obtaining full sentence justifications of a model’s decisions, improving universal sentence representations and transferring to out-of-domain NLI datasets.

NILE : Natural Language Inference with Faithful Natural Language Explanations

This work proposes Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce labels along with its faithful explanation and demonstrates NILE’s effectiveness over previously reported methods through automated and human evaluation of the produced labels and explanations.

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

This paper introduces a new scoring method that casts a plausibility ranking task in a full-text format, leverages the masked language modeling head tuned during the pre-training phase, and requires less annotated data than the standard classifier approach to reach equivalent performance.

Thinking Like a Skeptic: Defeasible Inference in Natural Language

Both a classification and a generation task for defeasible inference are developed from Defeasible NLI, and it is demonstrated that the generation task is much more challenging.

Abductive Commonsense Reasoning

This study introduces a challenge dataset, ART, that consists of over 20k commonsense narrative contexts and 200k explanations, and conceptualizes two new tasks: Abductive NLI, a multiple-choice question answering task for choosing the more likely explanation, and Abductive NLG, a conditional generation task for explaining given observations in natural language.

Explaining Question Answering Models through Text Generation

This work presents a model for multiple-choice question answering in which an LM-based generator produces a textual hypothesis that is later used by a classifier to answer the question; the generated hypotheses elucidate the knowledge used by the LM for answering the question.

Interpretable Neural Predictions with Differentiable Binary Variables

This work proposes a latent model that mixes discrete and continuous behaviour, allowing both binary selections and gradient-based training without REINFORCE, and can tractably compute the expected value of penalties such as L0, which allows it to directly optimise the model towards a pre-specified text selection rate.

Commonsense Knowledge Mining from Pretrained Models

This work develops a method for generating commonsense knowledge using a large, pre-trained bidirectional language model that can be used to rank a triple’s validity by the estimated pointwise mutual information between the two entities.