Corpus ID: 235446704

Refining Language Models with Compositional Explanations

@inproceedings{Yao2021RefiningLM,
  title={Refining Language Models with Compositional Explanations},
  author={Huihan Yao and Ying Chen and Qinyuan Ye and Xisen Jin and Xiang Ren},
  year={2021}
}
Pre-trained language models have been successful on text classification tasks, but are prone to learning spurious correlations from biased datasets, and are thus vulnerable when making inferences in a new domain. Prior works reveal such spurious patterns via post-hoc explanation algorithms which compute the importance of input features. Further, the model is regularized to align the importance scores with human knowledge, so that the unintended model behaviors are eliminated. However, such a…

References

SHOWING 1-10 OF 48 REFERENCES
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
TLDR
This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
Contextualizing Hate Speech Classifiers with Post-hoc Explanation
TLDR
This work extracts post-hoc explanations from fine-tuned BERT classifiers to detect bias towards identity terms and proposes a novel regularization technique based on these explanations that encourages models to learn from the context of group identifiers in addition to the identifiers themselves.
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation
TLDR
This work proposes a simple yet generic representation learning framework, named SHOT, which freezes the classifier module (hypothesis) of the source model and learns the target-specific feature extraction module by exploiting both information maximization and self-supervised pseudo-labeling to implicitly align representations from the target domains to the source hypothesis.
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
For an explanation of a deep learning model to be effective, it must provide both insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of…
Learning from Explanations with Neural Execution Tree
TLDR
A novel Neural Execution Tree (NExT) framework to augment training data for text classification using NL explanations by transforming NL explanations into executable logical forms by semantic parsing, which substantially increases the coverage of each NL explanation.
Model Adaptation: Unsupervised Domain Adaptation Without Source Data
TLDR
This paper proposes a new framework, which is referred to as collaborative class conditional generative adversarial net, to bypass the dependence on the source data and achieves superior performance on multiple adaptation tasks with only unlabeled target data, which verifies its effectiveness in this challenging setting.
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
TLDR
This paper proposes a formal and general way to quantify the importance of each word and phrase, and proposes the Sampling and Contextual Decomposition (SCD) and Sampling and Occlusion (SOC) algorithms, which outperform prior hierarchical explanation algorithms.
Transformers: State-of-the-Art Natural Language Processing
TLDR
Transformers is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.
A Survey of Methods for Explaining Black Box Models
TLDR
A classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system is provided to help researchers find the proposals most useful for their own work.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.