Corpus ID: 53231403

YASENN: Explaining Neural Networks via Partitioning Activation Sequences

Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin
We introduce a novel approach to feed-forward neural network interpretation based on partitioning the space of sequences of neuron activations. In line with this approach, we propose a model-specific interpretation method, called YASENN. Our method inherits many advantages of model-agnostic distillation, such as an ability to focus on the particular input region and to express an explanation in terms of features different from those observed by a neural network. Moreover, examination of… 
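The core idea the abstract describes, grouping inputs by the sequence of neuron activation patterns they induce in a feed-forward network, can be sketched as follows. This is a minimal illustration under assumed conventions (a hypothetical two-layer ReLU network with random weights), not the authors' implementation:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

# Hypothetical two-layer ReLU network with random weights.
W1 = rng.normal(size=(4, 3))   # input dim 4 -> hidden dim 3
W2 = rng.normal(size=(3, 2))   # hidden dim 3 -> hidden dim 2

def activation_sequence(x):
    """Return the binary on/off pattern of every ReLU unit for input x."""
    h1 = np.maximum(W1.T @ x, 0.0)
    h2 = np.maximum(W2.T @ h1, 0.0)
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

# Partition a batch of inputs by their activation sequence: inputs that
# share a sequence pass through the network in the same "mode", so each
# group can be explained jointly.
partition = defaultdict(list)
for i, x in enumerate(rng.normal(size=(100, 4))):
    partition[activation_sequence(x)].append(i)
```

Each key of `partition` is one activation sequence; an explanation method can then describe the inputs within a group in terms of any features convenient for the audience, which is the flexibility the abstract attributes to distillation-style methods.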
Review Study of Interpretation Methods for Future Interpretable Machine Learning
This paper reviews current interpretability methods and divides them according to the model being applied, aiming to help researchers find a suitable model for solving interpretability problems more easily.
How Case-Based Reasoning Explains Neural Networks: A Theoretical Analysis of XAI Using Post-Hoc Explanation-by-Example from a Survey of ANN-CBR Twin-Systems
The twin-systems approach is advanced as one possible coherent, generic solution to the XAI problem, and the paper concludes by road-mapping future directions for this XAI solution, considering further tests of feature-weighting techniques.
Research on Explainable Artificial Intelligence Techniques: An User Perspective
This paper evaluates the comprehensibility of explanations from the perspective of different types of users, aiming to build confidence in and understanding of the results produced by AI systems.
Case-Based Reasoning Research and Development: 27th International Conference, ICCBR 2019, Otzenhausen, Germany, September 8–12, 2019, Proceedings
The challenges and opportunities of case-based reasoning (CBR) for eXplainable AI (XAI) are mapped out.
How Case Based Reasoning Explained Neural Networks: An XAI Survey of Post-Hoc Explanation-by-Example in ANN-CBR Twins
It is argued that this twin-system approach, especially using ANN-CBR twins, presents one possible coherent, generic solution to the XAI problem (and, indeed, the XCBR problem), and some future directions for this XAI solution are road-mapped.
Deep Learning for Case-based Reasoning through Prototypes: A Neural Network that Explains its Predictions
This work creates a novel network architecture for deep learning that naturally explains its own reasoning for each prediction, and the explanations are loyal to what the network actually computes.
Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values
Somewhat surprisingly, it is found that DNNs with randomly initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights.
Distilling the Knowledge in a Neural Network
This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
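The distillation idea summarized above, training a student model to match a teacher's temperature-softened output distribution, can be sketched numerically. This is a minimal NumPy illustration with hypothetical logits and temperature, not the paper's acoustic-model setup:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    s = z / T
    e = np.exp(s - np.max(s))  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical teacher and student logits for a 3-class problem.
teacher_logits = np.array([5.0, 2.0, 0.5])
student_logits = np.array([4.0, 2.5, 1.0])

T = 4.0  # hypothetical distillation temperature
p_teacher = softmax(teacher_logits, T)
p_student = softmax(student_logits, T)

# Distillation loss: cross-entropy between the softened teacher and
# student distributions (equal to the KL term up to a constant).
distill_loss = -np.sum(p_teacher * np.log(p_student))
```

Raising `T` flattens both distributions, so the student is trained on the teacher's relative class similarities ("dark knowledge") rather than only on the hard argmax label.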
On the importance of single directions for generalization
It is found that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.
Towards better understanding of gradient-based attribution methods for Deep Neural Networks
This work analyzes four gradient-based attribution methods and formally prove conditions of equivalence and approximation between them, and constructs a unified framework which enables a direct comparison, as well as an easier implementation.
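One of the simplest members of the family such analyses compare, the gradient-times-input attribution, can be sketched for a model where the gradient is available in closed form. This is an illustrative example on a hypothetical linear model, not the unified framework from the paper:

```python
import numpy as np

# Tiny linear model f(x) = w . x with hypothetical weights; for a linear
# model the input gradient df/dx is simply w.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])

grad = w                  # df/dx for the linear model
attribution = grad * x    # gradient x input attribution per feature

# Completeness sanity check: for a bias-free linear model the per-feature
# attributions sum exactly to the model output f(x).
output = w @ x
```

Equivalence results of the kind the paper proves relate methods like this one to, e.g., epsilon-LRP and integrated gradients under specific conditions on the model's nonlinearities.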
Interpretable Convolutional Neural Networks
A method to modify a traditional convolutional neural network into an interpretable CNN, in order to clarify knowledge representations in high conv-layers of the CNN, which can help people understand the logic inside a CNN.
Insights on representational similarity in neural networks with canonical correlation
Comparing different neural network representations and determining how representations evolve over time remain challenging open questions in our understanding of the function of neural networks.
Extracting Tree-Structured Representations of Trained Networks
This work presents a novel algorithm, TREPAN, for extracting comprehensible, symbolic representations from trained neural networks, which is general in its applicability and scales well to large networks and problems with high-dimensional input spaces.
Transparent Model Distillation
This work investigates model distillation for transparency, asking whether fully-connected neural networks can be distilled into models that are transparent or interpretable in some sense, and tries two types of student models.
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation
An efficient variational approximation to the mutual information is developed, and the effectiveness of the method is shown on a variety of synthetic and real data sets using both quantitative metrics and human evaluation.