Corpus ID: 231786356

When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data

@article{Hase2022WhenCM,
  title={When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data},
  author={Peter Hase and Mohit Bansal},
  journal={ArXiv},
  year={2022},
  volume={abs/2102.02201}
}
Many methods now exist for conditioning models on task instructions and user-provided explanations for individual data points. These methods show great promise for improving task performance of language models beyond what can be achieved by learning from individual (x,y) pairs. In this paper, we (1) provide a formal framework for characterizing approaches to learning from explanation data, and (2) propose a synthetic task for studying how models learn from explanation data. In the first… 
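As a concrete illustration of the conditioning setup the abstract describes, the sketch below models p(y | x, e) by feeding an explanation e to a classifier alongside the input x. It is a minimal, assumed example (toy vocabulary, bag-of-words featurizer, made-up class count), not the paper's actual method or experimental setup.

# Minimal sketch (illustrative, not the paper's method): conditioning a
# classifier on an explanation e in addition to the input x, i.e. modelling
# p(y | x, e) rather than p(y | x). Vocabulary and featurizer are toy
# assumptions made up for this example.
import torch
import torch.nn as nn

VOCAB = {"because": 0, "not": 1, "feature": 2, "label": 3, "<unk>": 4}

def bag_of_words(text: str) -> torch.Tensor:
    """Count tokens from the toy vocabulary; unknown tokens share one bucket."""
    counts = torch.zeros(len(VOCAB))
    for token in text.lower().split():
        counts[VOCAB.get(token, VOCAB["<unk>"])] += 1.0
    return counts

class ExplanationConditionedClassifier(nn.Module):
    """Concatenates features of x and e before a linear prediction layer."""
    def __init__(self, feat_dim: int, n_classes: int = 2):
        super().__init__()
        self.out = nn.Linear(2 * feat_dim, n_classes)

    def forward(self, x_text: str, e_text: str) -> torch.Tensor:
        features = torch.cat([bag_of_words(x_text), bag_of_words(e_text)])
        return self.out(features)

model = ExplanationConditionedClassifier(feat_dim=len(VOCAB))
logits = model("feature not label", "because feature not label")  # scores for p(y | x, e)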
A survey on improving NLP models with human explanations
TLDR
An overview of different methods for learning from human explanations is given, and different factors that can inform the decision of which method to choose for a specific use-case are discussed.
Tell me why! - Explanations support learning of relational and causal structure
TLDR
It is shown that explanations help agents overcome the tendency to fixate on simple features, and the work identifies which aspects of explanations make them most beneficial; the results suggest that learning from explanations is a powerful principle that could offer a promising path towards training more robust and general machine learning systems.
SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
TLDR
SALKG, a simple framework for learning from KG explanations of both coarse and fine granularity, is proposed, which trains KG-augmented models to solve the task by focusing on KG information highlighted by the explanations as salient.
Learning from Natural Language Feedback
TLDR
This work proposes to learn from natural language feedback, which conveys more information per human evaluation than comparison feedback, and finetunes a GPT-3 model to roughly human-level summarization ability.
Supervising Model Attention with Human Explanations for Robust Natural Language Inference
TLDR
Using natural language explanations, the model is taught how a human would approach the NLI task, in order to learn features that will generalise better to previously unseen examples, improving model performance.
Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs
TLDR
It is suggested that models possess belief-like qualities to only a limited extent, but update methods can both fix incorrect model beliefs and greatly improve their consistency, and off-the-shelf optimizers are surprisingly strong belief-updating baselines.
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
TLDR
This work introduces NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances, and adopts generative pre-trained language models to encode task-specific instructions along with the input and generate the task output.
Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions
TLDR
This work uses existing NLP datasets and the instructions used to crowdsource them to create NATURALINSTRUCTIONS, a dataset of instructions and task-specific input/output data; results indicate that existing models indeed benefit from instructions and hence show improved generalization to new tasks.
Bridging Code-Text Representation Gap using Explanation
TLDR
This is the first work to define and categorize code explanation for enhancing code understanding/representation, and it confirms that even automatically generated explanations can lead to a drastic performance gain.
In-BoXBART: Get Instructions into Biomedical Multi-Task Learning
TLDR
This is the first attempt to propose a unified model in the biomedical domain and to use instructions to achieve generalization across several biomedical tasks; results indicate that there is room for improvement across tasks in the BoX, suggesting directions for future research.

References

Showing 1-10 of 62 references
Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations
TLDR
This work introduces a method for efficiently explaining and regularizing differentiable models by examining and selectively penalizing their input gradients, which provide a normal to the decision boundary.
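The input-gradient penalty described in this TLDR can be written down compactly; the sketch below is a hedged approximation in PyTorch, with the mask convention (1 marks features that should not influence the prediction) and the weight lam chosen for illustration rather than taken from the paper.

# Hedged sketch of a "right for the right reasons"-style objective: standard
# cross-entropy plus a penalty on input gradients over features an annotation
# mask marks as irrelevant (mask == 1). lam and the mask convention are
# assumptions made for this example.
import torch
import torch.nn.functional as F

def rrr_loss(model, x, y, irrelevant_mask, lam=10.0):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # Gradient of the summed log-probabilities with respect to the inputs.
    summed_log_probs = F.log_softmax(logits, dim=-1).sum()
    (input_grad,) = torch.autograd.grad(summed_log_probs, x, create_graph=True)
    penalty = ((irrelevant_mask * input_grad) ** 2).sum()
    return ce + lam * penalty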
Supervising Model Attention with Human Explanations for Robust Natural Language Inference
TLDR
Using natural language explanations, the model is taught how a human would approach the NLI task, in order to learn features that will generalise better to previously unseen examples, improving model performance.
Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?
TLDR
A leakage-adjusted simulatability (LAS) metric is introduced for evaluating NL explanations, which measures how well explanations help an observer predict a model’s output, while controlling for how explanations can directly leak the output.
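A rough sketch of how a leakage-adjusted score of this kind can be computed is shown below; the three simulator prediction arrays (from x and e, from x alone, and from e alone) are assumed to come from already-trained simulator models, and the grouping and averaging here is a simplification of the metric defined in the paper.

# Hedged sketch of a leakage-adjusted simulatability-style score: the gain in
# a simulator's accuracy at predicting the model's output when it also sees
# the explanation, averaged over "leaking" and "non-leaking" subgroups.
import numpy as np

def leakage_adjusted_score(y_model, pred_from_x_and_e, pred_from_x, pred_from_e):
    y_model = np.asarray(y_model)
    pred_xe = np.asarray(pred_from_x_and_e)
    pred_x = np.asarray(pred_from_x)
    pred_e = np.asarray(pred_from_e)
    leaked = pred_e == y_model  # explanation alone already reveals the output
    gain = (pred_xe == y_model).astype(float) - (pred_x == y_model).astype(float)
    groups = [gain[leaked], gain[~leaked]]
    return float(np.mean([g.mean() for g in groups if g.size > 0]))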
e-SNLI: Natural Language Inference with Natural Language Explanations
TLDR
The Stanford Natural Language Inference dataset is extended with an additional layer of human-annotated natural language explanations of the entailment relations, which can be used for various goals, such as obtaining full sentence justifications of a model’s decisions, improving universal sentence representations and transferring to out-of-domain NLI datasets.
Learning Classifiers from Declarative Language
TLDR
This work uses semantic parsing to map sentences to probabilistic assertions that are grounded in observable attributes of the data, and employs a training framework that depends on the differential associative strength of linguistic quantifiers.
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?
TLDR
This work introduces a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model, enabling principled, automatic, model-agnostic evaluation of attributions.
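The accuracy-gain idea in this TLDR fits in a few lines; the sketch below assumes you already have a teacher model's outputs on held-out inputs plus two students' predictions (one trained with the teacher's explanations, one without), and simply measures the difference in how often each student matches the teacher.

# Hedged sketch of explanation value as a student-simulation accuracy gain:
# how much better a student trained with the teacher's explanations matches
# the teacher's outputs than a student trained without them. All arguments
# are assumed prediction lists over the same held-out examples.
def explanation_value(teacher_outputs, student_with_expl, student_without_expl):
    def agreement(student_preds):
        matches = [s == t for s, t in zip(student_preds, teacher_outputs)]
        return sum(matches) / len(matches)
    return agreement(student_with_expl) - agreement(student_without_expl)

# Example: the explanation-trained student matches the teacher more often.
print(explanation_value([1, 0, 1, 1], [1, 0, 1, 1], [1, 0, 0, 0]))  # 0.5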
LIREx: Augmenting Language Inference with Relevant Explanation
TLDR
Qualitative analysis shows that LIREx generates flexible, faithful, and relevant NLEs that allow the model to be more robust to spurious explanations; LIREx also achieves significantly better performance than previous studies when transferred to the out-of-domain MultiNLI dataset.
Learning from Task Descriptions
TLDR
This work introduces a framework for developing NLP systems that solve new tasks after reading their descriptions, synthesizing prior work in this area, and instantiates it with a new English language dataset, ZEST, structured for task-oriented evaluation on unseen tasks.
Learning with Latent Language
TLDR
This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure and shows that, in all settings, models with a linguistic parameterization outperform those without.
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded
TLDR
This work proposes a generic approach called Human Importance-aware Network Tuning (HINT), which effectively leverages human demonstrations to improve visual grounding and encourages deep networks to be sensitive to the same input regions as humans.
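A heavily simplified sketch of importance supervision in this spirit appears below: it nudges a network's gradient-based input importances toward a human-provided importance map with an L2 alignment term, which is an assumed stand-in for the ranking-based objective used in the paper.

# Hedged sketch of human-importance supervision: add a term that aligns the
# model's gradient-based input importances with a human-provided importance
# map. The L2 alignment is an assumed simplification of HINT's ranking loss.
import torch
import torch.nn.functional as F

def importance_supervised_loss(model, x, y, human_map, lam=1.0):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    target_score = logits.gather(1, y.unsqueeze(1)).sum()  # score of the gold class
    (grads,) = torch.autograd.grad(target_score, x, create_graph=True)
    model_map = grads.abs().flatten(1)
    model_map = model_map / (model_map.sum(dim=1, keepdim=True) + 1e-8)
    human_map = human_map.flatten(1)
    human_map = human_map / (human_map.sum(dim=1, keepdim=True) + 1e-8)
    return F.cross_entropy(logits, y) + lam * F.mse_loss(model_map, human_map)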