AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models

@article{Wallace2019AllenNLPIA,
  title={AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models},
  author={Eric Wallace and Jens Tuyls and Junlin Wang and Sanjay Subramanian and Matt Gardner and Sameer Singh},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.09251}
}
Neural NLP models are increasingly accurate but are imperfect and opaque; they break in counterintuitive ways and leave end users puzzled at their behavior. [...] The toolkit provides interpretation primitives (e.g., input gradients) for any AllenNLP model and task, a suite of built-in interpretation methods, and a library of front-end visualization components. We demonstrate the toolkit's flexibility and utility by implementing live demos for five interpretation methods (e.g., saliency maps and…
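As a rough illustration of those interpretation primitives, the sketch below runs gradient-based saliency over an archived AllenNLP predictor. The model URL and input sentence are placeholders, and the class and method names follow the allennlp.interpret module as of the 2019 release as best I recall it, so treat this as a sketch rather than authoritative usage.

from allennlp.predictors import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

# Load any archived AllenNLP model as a Predictor (the URL is a placeholder).
predictor = Predictor.from_path("https://example.com/sentiment-model.tar.gz")

# SimpleGradient computes vanilla input-gradient saliency for the prediction.
interpreter = SimpleGradient(predictor)

# Returns per-token importance scores (normalized gradients) for each input field,
# which the front-end components can render as a saliency map.
saliency = interpreter.saliency_interpret_from_json({"sentence": "a very enjoyable film"})
print(saliency)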
Citations
Interpreting Predictions of NLP Models
TLDR
This tutorial will provide a background on interpretation techniques, i.e., methods for explaining the predictions of NLP models, and present a thorough study of example-specific interpretations, including saliency maps, input perturbations, and influence functions.
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions
TLDR
It is found that influence functions are particularly useful for natural language inference, a task in which ‘saliency maps’ may not have a clear interpretation, and a new quantitative measure based on influence functions is developed that can reveal artifacts in training data.
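For reference, the influence-function estimate that this line of work builds on (Koh and Liang, 2017) scores how much up-weighting a training example z would change the loss on a test example; in the usual notation:

\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}})
  = -\,\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top}
     H_{\hat{\theta}}^{-1}\,
     \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta}).

The cited paper's artifact measure aggregates such scores over the training data; the formula above is only the standard building block, not its exact formulation.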
Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability
TLDR
It is proposed that a concrete definition of interpretation is needed before its faithfulness can be evaluated, and it is found that although interpretation methods perform differently under a given evaluation metric, such differences may result not from interpretation quality or faithfulness but from the inherent bias of the evaluation metric.
InterpreT: An Interactive Visualization Tool for Interpreting Transformers
TLDR
InterpreT is an interactive visualization tool for interpreting Transformer-based models, and its functionalities are demonstrated through the analysis of model behaviours for two disparate tasks: Aspect Based Sentiment Analysis and the Winograd Schema Challenge.
SQuAD2-CR: Semi-supervised Annotation for Cause and Rationales for Unanswerability in SQuAD 2.0
TLDR
The SQuAD2-CR dataset, which contains annotations on unanswerable questions from the SQuAD 2.0 dataset, is released to enable explanatory analysis of model predictions; each question is annotated with why the most plausible answer span cannot be the answer and which part of the question causes the unanswerability.
Gradient-based Analysis of NLP Models is Manipulable
TLDR
This paper merges the layers of a target model with a Facade Model that overwhelms the gradients without affecting the predictions, and shows that the merged model effectively fools different analysis tools.
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models
TLDR
The Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models, is presented, which integrates local explanations, aggregate analysis, and counterfactual generation into a streamlined, browser-based interface to enable rapid exploration and error analysis.
Can BERT Reason? Logically Equivalent Probes for Evaluating the Inference Capabilities of Language Models
TLDR
It is found that despite the recent success of large PTLMs on commonsense benchmarks, their performance on these probes is no better than random guessing (even with fine-tuning) and is heavily dependent on biases; the poor overall performance prevents a meaningful study of robustness.
A Responsible Machine Learning Workflow with Focus on Interpretable Models, Post-hoc Explanation, and Discrimination Testing
TLDR
A template workflow is provided for machine learning applications that require high accuracy and interpretability and that mitigate risks of discrimination, offering a viable approach for training and evaluating machine learning systems for high-stakes, human-centered, or regulated applications using common Python programming tools.
Why do you think that? Exploring faithful sentence-level rationales without supervision
TLDR
This work proposes a differentiable training framework to create models that output faithful rationales at the sentence level by applying supervision only on the target task, and exploits the transparent decision-making process of these models to prefer the correct rationales by applying direct supervision, thereby boosting performance at the rationale level.

References

Showing 1-10 of 25 references
Pathologies of Neural Models Make Interpretations Difficult
TLDR
This work uses input reduction, which iteratively removes the least important word from the input, to expose pathological behaviors of neural models: the remaining words appear nonsensical to humans and are not the ones determined as important by interpretation methods.
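The input-reduction procedure summarized above is easy to sketch: repeatedly drop the token whose removal least reduces the model's confidence in its original prediction, stopping just before the prediction flips. The snippet below is a minimal, model-agnostic illustration assuming a hypothetical predict_proba(tokens) function that returns a list of class probabilities; it is not the authors' implementation.

def input_reduction(tokens, predict_proba, min_len=1):
    # predict_proba(tokens) -> list of class probabilities (hypothetical interface).
    probs = predict_proba(tokens)
    label = probs.index(max(probs))  # prediction on the full input
    while len(tokens) > min_len:
        # Try removing each token and keep the removal that best preserves confidence.
        candidates = [tokens[:i] + tokens[i + 1:] for i in range(len(tokens))]
        confidences = [predict_proba(c)[label] for c in candidates]
        best = confidences.index(max(confidences))
        new_probs = predict_proba(candidates[best])
        # Stop as soon as another removal would change the predicted label.
        if new_probs.index(max(new_probs)) != label:
            break
        tokens = candidates[best]
    return tokens  # often a short, nonsensical input that keeps the original prediction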
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
TLDR
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
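As a usage note, the same idea is available in the authors' open-source lime package; the sketch below wraps a toy scikit-learn text classifier (the two-example training set is obviously a placeholder) and asks for the words that most influenced one prediction.

from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder black-box classifier: any model exposing predict_proba on raw strings works.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(["great movie", "terrible movie"], [1, 0])  # toy training data

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "a surprisingly great movie",
    pipeline.predict_proba,  # LIME perturbs the text and queries this function
    num_features=4,          # number of words to keep in the local linear model
)
print(explanation.as_list())  # [(word, weight), ...] for the explained class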
Annotation Artifacts in Natural Language Inference Data
TLDR
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.
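The finding above amounts to a hypothesis-only baseline: a bag-of-words classifier trained on hypotheses alone, with the premise deliberately withheld, should not beat the majority class on an artifact-free dataset. A minimal scikit-learn sketch with placeholder data (the real experiments use SNLI and MultiNLI):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy placeholder data; only the hypothesis side of each NLI pair is used.
hypotheses = ["A man is sleeping.", "Nobody is outside.", "A dog is playing fetch.", "The woman is eating."]
labels = ["neutral", "contradiction", "entailment", "neutral"]

hypothesis_only = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                                LogisticRegression(max_iter=1000))
hypothesis_only.fit(hypotheses, labels)

# Held-out accuracy well above the majority-class rate is evidence of annotation artifacts.
print(hypothesis_only.predict(["Nobody is eating."]))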
AllenNLP: A Deep Semantic Natural Language Processing Platform
TLDR
AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily and provides a flexible data API that handles intelligent batching and padding, and a modular and extensible experiment framework that makes doing good science easy.
Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy.
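The attribution method this paper introduces, integrated gradients, can be approximated with a simple Riemann sum over inputs interpolated between a baseline and the actual input. The NumPy sketch below assumes a hypothetical gradient_fn(x) that returns the gradient of the target output with respect to the input features:

import numpy as np

def integrated_gradients(x, baseline, gradient_fn, steps=50):
    # Interpolate from the baseline to the input and collect gradients along the path.
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.stack([gradient_fn(baseline + a * (x - baseline)) for a in alphas])
    # Average the path gradients and scale by the input difference; this is a Riemann
    # approximation of the path integral, and finer `steps` gives a closer estimate.
    return (x - baseline) * grads.mean(axis=0)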
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Visual Interrogation of Attention-Based Models for Natural Language Inference and Machine Comprehension
TLDR
A flexible visualization library is presented for creating customized visual analytic environments in which the user can investigate and interrogate the relationships among the input, the model internals, and the output predictions, which in turn shed light on the model's decision-making process.
Visualizing and Understanding Recurrent Networks
TLDR
This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
TLDR
A new reading comprehension benchmark, DROP, is introduced, which requires Discrete Reasoning Over the content of Paragraphs, along with a new model that combines reading comprehension methods with simple numerical reasoning to achieve 51% F1.
QADiver: Interactive Framework for Diagnosing QA Models
TLDR
A web-based UI is proposed that shows how each model contributes to QA performance by integrating visualization and analysis tools for model explanation; this framework is expected to help QA model researchers refine and improve their models.