Prompting Contrastive Explanations for Commonsense Reasoning Tasks

Bhargavi Paranjape, Julian Michael, Marjan Ghazvininejad, Luke Zettlemoyer, Hannaneh Hajishirzi
Many commonsense reasoning NLP tasks involve choosing between one or more possible answers to a question or prompt based on knowledge that is often implicit. Large pretrained language models (PLMs) can achieve near-human performance on such tasks, while providing little human-interpretable evidence of the underlying reasoning they use. In this work, we show how to use these same models to generate such evidence: inspired by the contrastive nature of human explanations, we use PLMs to complete… 

Commonsense Reasoning for Question Answering with Explanations

A latent-variable model is proposed that identifies what type of knowledge from an external knowledge base may be relevant to answering the question, computes the commonsense inferences, and predicts the answer; it can also learn to provide posterior rationales for why a certain answer was chosen.

Generated Knowledge Prompting for Commonsense Reasoning

This work develops generated knowledge prompting, which consists of generating knowledge from a language model and then providing that knowledge as additional input when answering a question. The method improves the performance of large-scale, state-of-the-art models on four commonsense reasoning tasks.
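The two steps in this summary (generate knowledge, then answer with it as extra input) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generate_knowledge` and `score_choice` are hypothetical callables standing in for real language-model calls, the toy versions below use fixed text and word overlap, and taking the single best-scoring (knowledge, choice) pair is an assumed aggregation rule.

```python
import re
from typing import Callable, List, Tuple


def answer_with_generated_knowledge(
    question: str,
    choices: List[str],
    generate_knowledge: Callable[[str], List[str]],
    score_choice: Callable[[str, str], float],
) -> Tuple[str, str]:
    """Return (best_choice, knowledge_statement_that_supported_it)."""
    # The empty string keeps the knowledge-free prompt in the comparison.
    contexts = [""] + generate_knowledge(question)
    best = None
    for knowledge in contexts:
        # Knowledge is prepended to the question as additional input.
        prompt = f"{knowledge} {question}".strip()
        for choice in choices:
            score = score_choice(prompt, choice)
            if best is None or score > best[0]:
                best = (score, choice, knowledge)
    return best[1], best[2]


# --- Hypothetical stand-ins for language-model calls (assumptions) ---
def toy_generate_knowledge(question: str) -> List[str]:
    # A real system would sample these statements from a language model.
    return ["Penguins hunt fish and krill in the ocean."]


def toy_score_choice(prompt: str, choice: str) -> float:
    # Word overlap replaces a model's answer score in this sketch.
    prompt_words = set(re.findall(r"[a-z]+", prompt.lower()))
    choice_words = set(re.findall(r"[a-z]+", choice.lower()))
    return float(len(prompt_words & choice_words))


answer, support = answer_with_generated_knowledge(
    "What do penguins eat?",
    ["fish", "grass"],
    toy_generate_knowledge,
    toy_score_choice,
)
```

With the toy scorer, neither choice overlaps with the bare question, so the selected answer comes from the knowledge-augmented prompt. The returned knowledge statement can then be inspected as the evidence behind the prediction.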

Unsupervised Explanation Generation via Correct Instantiations

NEON, a two-phase, unsupervised explanation generation framework, is proposed: it generates corrected instantiations of the statement, then uses them to prompt large PLMs to complete the explanation. NEON is also shown to remain effective when generalizing to different scenarios.

Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering

Rainier, or Reinforced Knowledge Introspector, is presented. The work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3.

Reframing Human-AI Collaboration for Generating Free-Text Explanations

This work creates a pipeline that combines GPT-3 with a supervised filter that incorporates binary acceptability judgments from humans in the loop and demonstrates that acceptability is partially correlated with various fine-grained attributes of explanations.

Iteratively Prompt Pre-trained Language Models for Chain of Thought

This work proposes an iterative prompting framework, a new prompting paradigm that progressively elicits relevant knowledge from PLMs for multi-step inference, and an iterative context-aware prompter, which addresses limitations of existing prompting methods by learning to dynamically synthesize prompts conditioned on the current step’s contexts.

Shepherd Pre-trained Language Models to Develop a Train of Thought: An Iterative Prompting Approach

This work proposes an iterative prompting framework, a new prompting paradigm that progressively elicits relevant knowledge from PLMs for multi-step inference tasks, and an iterative context-aware prompter, which addresses limitations by learning to dynamically synthesize prompts conditioned on the current step’s contexts.

Elaboration-Generating Commonsense Question Answering at Scale

This work uses smaller language models to generate useful intermediate context, referred to here as elaborations, and alternates between updating two language models—an elaboration generator and an answer predictor—allowing each to influence the other.

Constructing Natural Language Explanations via Saliency Map Verbalization

The results suggest that saliency map verbalization makes explanations more understandable and less cognitively challenging to humans than conventional heatmap visualization.

CCPrompt: Counterfactual Contrastive Prompt-Tuning for Many-Class Classification

The Counterfactual Contrastive Prompt-Tuning (CCPrompt) approach is proposed for many-class classification, e.g., relation classification, topic classification, and entity typing, and the results indicate that the model outperforms former baselines.



Explain Yourself! Leveraging Language Models for Commonsense Reasoning

This work collects human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation framework.

PIQA: Reasoning about Physical Commonsense in Natural Language

The task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA), are introduced, and an analysis of the dimensions of knowledge that existing models lack is provided, which offers significant opportunities for future research.

CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning

This work presents CommonGen, a constrained text generation task with an associated benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning, and demonstrates that the learned generative commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA by generating additional context.

A Simple Method for Commonsense Reasoning

Key to this method is the use of language models, trained on a massive amount of unlabeled data, to score multiple-choice questions posed by commonsense reasoning tests; the method outperforms previous state-of-the-art methods by a large margin.
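To illustrate the scoring idea in this summary, the sketch below substitutes each candidate answer into a sentence and keeps the candidate whose full sentence receives the highest language-model probability. Everything here is an assumption made for illustration: a tiny add-one-smoothed bigram model replaces the large neural language models of the paper, and the corpus, template, and function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(corpus):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen tokens
    tokens = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log(
            (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        )
    return total


def pick_answer(template, candidates, unigrams, bigrams):
    """Fill the template with each candidate; return the best-scoring one."""
    return max(
        candidates,
        key=lambda c: log_prob(template.format(c), unigrams, bigrams),
    )


# Toy "unlabeled" training text (an assumption for this sketch).
corpus = [
    "she poured water into the glass",
    "he drank water from the glass",
    "she hammered the nail into the wall",
]
unigrams, bigrams = train_bigram_lm(corpus)
best = pick_answer(
    "she poured water into the {}", ["glass", "nail"], unigrams, bigrams
)
```

In this toy corpus "the glass" occurs more often than "the nail", so the sentence completed with "glass" scores higher. A bigram model only sees adjacent words; the longer-range dependencies that commonsense tests rely on are what a large pretrained model would have to supply.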

Social IQA: Commonsense Reasoning about Social Interactions

It is established that Social IQa, the first large-scale benchmark for commonsense reasoning about social situations, is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap).

Measuring Association Between Labels and Free-Text Rationales

It is demonstrated that *pipelines*, models for faithful rationalization on information-extraction style tasks, do not work as well on “reasoning” tasks requiring free-text rationales, and state-of-the-art T5-based joint models exhibit desirable properties for explaining commonsense question-answering and natural language inference.

Unsupervised Commonsense Question Answering with Self-Talk

This work proposes an unsupervised framework based on self-talk as a novel alternative for multiple-choice commonsense tasks. Inspired by inquiry-based discovery learning, it improves performance on several benchmarks and competes with models that obtain knowledge from external KBs.

Explaining Question Answering Models through Text Generation

This work proposes a model for multi-choice question answering in which an LM-based generator produces a textual hypothesis that is later used by a classifier to answer the question; the hypotheses elucidate the knowledge used by the LM for answering the question.

e-SNLI: Natural Language Inference with Natural Language Explanations

The Stanford Natural Language Inference dataset is extended with an additional layer of human-annotated natural language explanations of the entailment relations, which can be used for various goals, such as obtaining full sentence justifications of a model’s decisions, improving universal sentence representations and transferring to out-of-domain NLI datasets.

CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

This work presents CommonsenseQA: a challenging new dataset for commonsense question answering, which extracts from ConceptNet multiple target concepts that have the same semantic relation to a single source concept.