Interpretable and Steerable Sequence Learning via Prototypes

  Yao Ming, Panpan Xu, Huamin Qu, Liu Ren
  Published 23 July 2019
  Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
One of the major challenges in machine learning nowadays is to provide predictions with not only high accuracy but also user-friendly explanations. The prediction is obtained by comparing the inputs to a few prototypes, which are exemplar cases in the problem domain. For better interpretability, several criteria for constructing the prototypes are defined, including simplicity, diversity, and sparsity, together with a learning objective and an optimization procedure.
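The prediction scheme the summary describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the encoder, the exponential distance-to-similarity mapping, and the function name `prototype_predict` are all assumptions for the sketch.

```python
import numpy as np

def prototype_predict(z, prototypes, W, b):
    """Compare an encoded input to learned prototypes and classify
    from the similarities (illustrative sketch, not the paper's code).
    z: (d,) latent encoding of the input sequence (e.g. from an RNN)
    prototypes: (k, d) learned prototype vectors
    W, b: linear output layer over the k similarity scores
    """
    # squared L2 distance from the encoding to each prototype row
    d2 = np.sum((prototypes - z) ** 2, axis=1)
    # convert distance to similarity: closer prototypes score higher
    sim = np.exp(-d2)
    # linear layer turns prototype similarities into class logits
    logits = W @ sim + b
    return logits, sim
```

Because each similarity is tied to one exemplar case, the logits decompose into per-prototype contributions, which is what makes the explanation human-readable.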

ProtoSteer: Steering Deep Sequence Model with Prototypes

This work tackles the challenge of directly involving domain experts to steer a deep sequence model without relying on model developers as intermediaries, and demonstrates that involving domain users can help obtain more interpretable models with concise prototypes while retaining similar accuracy.

Interpreting Convolutional Sequence Model by Learning Local Prototypes with Adaptation Regularization

A sequence modeling approach is developed that explains its own predictions by breaking input sequences down into evidencing segments (i.e., sub-sequences) in its reasoning; it generally achieves high interpretability together with accuracy competitive with state-of-the-art approaches.

ProtoViewer: Visual Interpretation and Diagnostics of Deep Neural Networks with Factorized Prototypes

A novel visual analytics framework to interpret and diagnose DNNs by utilizing ProtoFac to factorize the latent representations in DNNs into weighted combinations of prototypes, which are exemplar cases from the original data.

Interpreting Deep Neural Networks through Prototype Factorization

This work proposes ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer in a pre-trained DNN as a collection of weighted prototypes, which are a small number of exemplars extracted from the original data.
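The factorization idea in the summary can be illustrated with a toy sketch: approximate an activation matrix as weights times prototype rows, where the prototypes are actual rows of the data. Everything here beyond that idea (random exemplar selection, plain least squares, the name `factorize_with_exemplars`) is an assumption for illustration; the paper's actual algorithm additionally enforces constraints such as nonnegativity and optimizes the exemplar choice.

```python
import numpy as np

def factorize_with_exemplars(A, k, seed=0):
    """Toy prototype factorization in the spirit of ProtoFac
    (illustrative assumptions, not the paper's method):
    approximate latent activations A (n x d) as W @ P, where the
    k prototype rows P are real rows of A and W holds weights
    fit by ordinary least squares.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(A), size=k, replace=False)  # pick exemplar rows
    P = A[idx]                                       # prototypes are real data rows
    # solve min ||A - W @ P|| via least squares on the transpose
    Wt, *_ = np.linalg.lstsq(P.T, A.T, rcond=None)
    return Wt.T, P, idx
```

Because each prototype is an actual exemplar from the data, the weight matrix reads as "this activation is so-much of case A plus so-much of case B", which is the source of the interpretability.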

Semantic Explanation for Deep Neural Networks Using Feature Interactions

This work presents a method to generate semantic and quantitative explanations that are easily interpretable by humans, and shows that including feature interactions not only generates explanations but also makes them richer and able to convey more information.

Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond

This paper introduces and clarifies two basic concepts—interpretations and interpretability—that people usually confuse, and summarizes current work on evaluating models’ interpretability using “trustworthy” interpretation algorithms.

Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models

A novel human-in-the-loop approach is presented that generates user-defined concepts for model interpretation and diagnostics using active learning, combining human knowledge and feedback to train a concept extractor with very little human labeling effort.

Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling

This work proposes a Self-Explaining Selective Model (SESM) that uses a linear combination of prototypical concepts to explain its own predictions and designs multiple constraints including diversity, stability, and locality as training objectives.

Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges

This work provides fundamental principles for interpretable ML and dispels common misunderstandings that dilute the importance of this crucial topic.

Interpretable Image Classification with Differentiable Prototypes Assignment

This work introduces ProtoPool, an interpretable prototype-based model with positive reasoning and three main novelties, and proposes a new focal similarity function that contrasts the prototype from the background and consequently concentrates on more salient visual features.

Deep Learning for Case-based Reasoning through Prototypes: A Neural Network that Explains its Predictions

This work creates a novel network architecture for deep learning that naturally explains its own reasoning for each prediction, and the explanations are loyal to what the network actually computes.

“Why Should I Trust You?”: Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
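The "learn an interpretable model locally around the prediction" idea can be sketched as follows. This is a hand-rolled illustration, not the official `lime` package: the binary-mask perturbations, the exponential proximity kernel, and the name `lime_style_explain` are all assumptions chosen to keep the sketch short.

```python
import numpy as np

def lime_style_explain(f, x, n_samples=500, width=0.75, seed=0):
    """Minimal LIME-style sketch: perturb the input by randomly
    switching features off, query the black-box f, and fit a
    locality-weighted linear surrogate whose coefficients score
    each feature's local importance.
    """
    rng = np.random.default_rng(seed)
    d = len(x)
    masks = rng.integers(0, 2, size=(n_samples, d))  # random on/off feature masks
    samples = masks * x                              # zero out the masked features
    preds = np.array([f(s) for s in samples])        # query the black box
    # proximity kernel: samples that keep more features get more weight
    dist = 1.0 - masks.mean(axis=1)
    w = np.exp(-(dist ** 2) / width ** 2)
    # weighted least squares for the interpretable linear surrogate
    X = masks * np.sqrt(w)[:, None]
    y = preds * np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

The surrogate's coefficients are the explanation: they say how much each feature drives the black-box output in the neighborhood of `x`, regardless of what the underlying classifier is.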

Visualizing and Understanding Recurrent Networks

This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks

This work presents LSTMVis, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics, and describes the domain, the different stakeholders, and their goals and tasks.

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

The driving force behind the recent success of LSTMs has been their ability to learn complex and non-linear relationships. Consequently, our inability to describe these relationships has led to LSTMs being characterized as black boxes.

A Critical Review of Recurrent Neural Networks for Sequence Learning

The goal of this survey is to provide a self-contained explication of the state of the art of recurrent neural networks, together with a historical perspective and references to primary research.

Towards A Rigorous Science of Interpretable Machine Learning

This position paper defines interpretability and describes when interpretability is needed (and when it is not), and suggests a taxonomy for rigorous evaluation and exposes open questions towards a more rigorous science of interpretable machine learning.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Automatic Rule Extraction from Long Short Term Memory Networks

By identifying consistently important patterns of words, this paper is able to distill state-of-the-art LSTMs on sentiment analysis and question answering into a set of representative phrases; this is quantitatively validated by using the extracted phrases to construct a simple, rule-based classifier that approximates the output of the LSTM.