Interpretability of deep learning models: A survey of results

@article{Chakraborty2017InterpretabilityOD,
  title={Interpretability of deep learning models: A survey of results},
  author={Supriyo Chakraborty and Richard J. Tomsett and Ramya Raghavendra and Daniel Harborne and Moustafa Farid Alzantot and F. Cerutti and Mani B. Srivastava and Alun David Preece and Simon J. Julier and Raghuveer M. Rao and Troy D. Kelley and Dave Braines and M. Sensoy and Chris J. Willis and Prudhvi K. Gurram},
  journal={2017 IEEE SmartWorld, Ubiquitous Intelligence \& Computing, Advanced \& Trusted Computed, Scalable Computing \& Communications, Cloud \& Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI)},
  year={2017},
  pages={1-6}
}
Deep neural networks have achieved near-human accuracy levels in various types of classification and prediction tasks including images, text, speech, and video data. However, the networks continue to be treated mostly as black-box function approximators, mapping a given input to a classification output. The next step in this human-machine evolutionary process — incorporating these networks into mission critical processes such as medical diagnosis, planning and control — requires a level of… 
NeuroMask: Explaining Predictions of Deep Neural Networks through Mask Learning
TLDR
A novel method, NeuroMask, for generating an interpretable explanation of classification model results is presented and is found to be both accurate and interpretable.
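The entry above only states the goal; what follows is a minimal PyTorch sketch of the general mask-learning idea, assuming a classifier model that returns logits for a batched input x. The zero baseline, sparsity weight, and optimizer settings are illustrative assumptions, not NeuroMask's published objective.

    import torch

    def learn_explanation_mask(model, x, target_class, steps=200, lr=0.05, sparsity=1e-2):
        # Optimize a soft mask over x so the masked input still scores highly for
        # target_class while the mask stays sparse (generic mask-learning sketch,
        # not the authors' exact loss).
        model.eval()
        mask_logits = torch.zeros_like(x, requires_grad=True)  # unconstrained mask parameters
        baseline = torch.zeros_like(x)                         # value for "removed" pixels (assumption)
        opt = torch.optim.Adam([mask_logits], lr=lr)
        for _ in range(steps):
            mask = torch.sigmoid(mask_logits)                  # keep mask entries in [0, 1]
            masked_x = mask * x + (1 - mask) * baseline
            score = torch.log_softmax(model(masked_x), dim=-1)[0, target_class]
            loss = -score + sparsity * mask.abs().mean()       # preserve the class, prefer small masks
            opt.zero_grad()
            loss.backward()
            opt.step()
        return torch.sigmoid(mask_logits).detach()             # importance map, same shape as x

The returned map can be thresholded or overlaid on the input to show which regions the classifier relied on.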
An Interpretable Deep Architecture for Similarity Learning Built Upon Hierarchical Concepts
TLDR
An effective similarity neural network (SNN) is proposed not only to seek robust retrieval performance but also to achieve satisfactory post-hoc interpretability, and can offer superior performance when compared against state-of-the-art approaches.
Interpretability in Deep Learning for Heston Smile Modeling Calibration
TLDR
This project builds a neural network that reproduces the relationship between volatility smiles and model parameters in the Heston model, evaluates the network's performance, and applies interpretability methods to it, in particular to find out which input feature impacts the model output the most.
Interpretable Convolutional Filters with SincNet
TLDR
This paper proposes SincNet, a novel Convolutional Neural Network that encourages the first layer to discover more meaningful filters by exploiting parametrized sinc functions, and shows that the proposed architecture converges faster, performs better, and is more interpretable than standard CNNs.
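A small numpy sketch of the parametrized sinc idea: each first-layer filter is a band-pass FIR filter determined entirely by two cutoff frequencies, formed as the difference of two windowed sinc low-pass filters. The kernel size, sample rate, and Hamming window here are illustrative choices, not SincNet's exact configuration.

    import numpy as np

    def sinc_bandpass(f_low, f_high, kernel_size=251, sample_rate=16000):
        # Band-pass filter parameterized only by its two cutoff frequencies (Hz):
        # the core trick that makes the first layer interpretable.
        t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / sample_rate
        lowpass = lambda fc: 2 * fc * np.sinc(2 * fc * t)   # ideal low-pass impulse response
        h = lowpass(f_high) - lowpass(f_low)                # band-pass = difference of two low-passes
        h *= np.hamming(kernel_size)                        # window to reduce spectral ripple
        return h / np.abs(h).max()

    # In a learnable layer, (f_low, f_high) per filter would be the only trainable
    # parameters; here we just instantiate one fixed 300-3400 Hz filter.
    telephone_band = sinc_bandpass(300.0, 3400.0)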
On Interpretability of Artificial Neural Networks: A Survey
TLDR
A simple but comprehensive taxonomy for interpretability is proposed; recent studies on the interpretability of neural networks are systematically reviewed, applications of interpretability in medicine are described, and future research directions are discussed, such as connections to fuzzy logic and brain science.
MARGIN: Uncovering Deep Neural Networks Using Graph Signal Analysis
TLDR
MARGIN is a simple yet general approach to a large set of interpretability tasks; it exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, defined as those nodes that maximally describe a function defined on the graph.
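The summary above is abstract; as a rough illustration only (an assumption on my part, not the paper's actual scoring rule), one can rank nodes by the local variation of the graph signal under the combinatorial Laplacian.

    import numpy as np

    def node_influence(adjacency, signal):
        # Score each node by how sharply a function defined on the graph changes
        # around it, using the graph Laplacian L = D - A.  Only a crude proxy for
        # MARGIN's notion of influential nodes.
        degree = np.diag(adjacency.sum(axis=1))
        laplacian = degree - adjacency
        return np.abs(laplacian @ signal)

    # Toy example: a 5-node path graph whose signal jumps between nodes 2 and 3.
    A = np.array([[0, 1, 0, 0, 0],
                  [1, 0, 1, 0, 0],
                  [0, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1],
                  [0, 0, 0, 1, 0]], dtype=float)
    f = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
    print(node_influence(A, f))   # nodes 2 and 3 receive the highest scores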
Local Interpretations for Explainable Natural Language Processing: A Survey
TLDR
This work investigates various methods to improve the interpretability of deep neural networks for natural language processing (NLP) tasks, including machine translation and sentiment analysis.
Interpretability in deep learning for finance: a case study for the Heston model
TLDR
This paper focuses on the calibration process of a stochastic volatility model, a subject recently tackled by deep learning algorithms, and finds that global strategies such as Shapley values can be effectively used in practice.
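A minimal numpy sketch of permutation-based (Monte Carlo) Shapley estimation for a black-box model; the toy model, zero baseline, and sample count are illustrative, and the estimator actually used for the Heston calibration network may differ.

    import numpy as np

    def shapley_values(model, x, baseline, n_permutations=200, seed=0):
        # Monte Carlo Shapley estimate for a single input x: average the marginal
        # contribution of each feature over random feature orderings, holding
        # not-yet-added features at their baseline values.
        rng = np.random.default_rng(seed)
        phi = np.zeros(x.shape[0])
        for _ in range(n_permutations):
            order = rng.permutation(x.shape[0])
            current = baseline.copy()
            prev = model(current)
            for j in order:
                current[j] = x[j]            # add feature j to the coalition
                out = model(current)
                phi[j] += out - prev         # its marginal contribution
                prev = out
        return phi / n_permutations

    # Toy black-box: additive, so the attributions converge to about [3, 4, 0].
    f = lambda z: 3 * z[0] + z[1] ** 2
    print(shapley_values(f, np.array([1.0, 2.0, 5.0]), np.zeros(3)))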
On Interpretability of Artificial Neural Networks
TLDR
This work systematically reviews recent studies on understanding the mechanisms of neural networks and sheds light on some future directions of interpretability research.
Definitions, methods, and applications in interpretable machine learning
TLDR
This work defines interpretability in the context of machine learning, introduces the predictive, descriptive, relevant (PDR) framework for discussing interpretations, and identifies three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy.
...

References

Showing 1-10 of 47 references
Evaluating the Visualization of What a Deep Neural Network Has Learned
TLDR
A general methodology based on region perturbation is proposed for evaluating ordered collections of pixels such as heatmaps; it shows that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method.
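A short numpy sketch of the region-perturbation protocol: replace input regions in order of decreasing relevance and track how quickly the target-class score drops, so a steeper drop indicates a more faithful heatmap. The noise replacement and the callable model returning per-class scores are assumptions of this sketch, not the paper's exact setup.

    import numpy as np

    def perturbation_curve(model, x, heatmap, target_class, n_steps=20, seed=0):
        # "Most relevant first": progressively replace the pixels the heatmap
        # ranks highest and record the class score after each step.
        rng = np.random.default_rng(seed)
        order = np.argsort(heatmap.ravel())[::-1]
        flat = x.copy().ravel()
        chunk = max(1, order.size // n_steps)
        scores = [model(flat.reshape(x.shape))[target_class]]
        for step in range(n_steps):
            idx = order[step * chunk:(step + 1) * chunk]
            flat[idx] = rng.uniform(x.min(), x.max(), size=idx.size)  # perturb the region
            scores.append(model(flat.reshape(x.shape))[target_class])
        return np.array(scores)  # the faster this falls, the better the heatmap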
Visualizing Higher-Layer Features of a Deep Network
TLDR
This paper contrasts and compares several techniques applied to Stacked Denoising Autoencoders and Deep Belief Networks, trained on several vision datasets, and shows that good qualitative interpretations of the high-level features represented by such models are possible at the unit level.
Opening the Black Box of Deep Neural Networks via Information
TLDR
This work demonstrates the effectiveness of the Information-Plane visualization of DNNs and shows that the training time is dramatically reduced when adding more hidden layers, and the main advantage of the hidden layers is computational.
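The information-plane view tracks the mutual information I(X;T) and I(T;Y) of each layer's representation T over training; below is a compact numpy sketch of the histogram (binning) estimator for two scalar variables. Binning continuous activations is itself a strong assumption of this sketch rather than a prescription from the paper.

    import numpy as np

    def mutual_information(x, y, bins=30):
        # Histogram estimate of I(X; Y) in nats for two scalar samples; in an
        # information-plane analysis, x could be a (binned) activation and y the
        # input or label variable.
        joint, _, _ = np.histogram2d(x, y, bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        nz = pxy > 0
        return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())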
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
TLDR
This work dramatically improves the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network (DGN), which generates qualitatively state-of-the-art synthetic images that look almost real.
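A minimal PyTorch sketch of plain activation maximization (gradient ascent on the pixels); the paper above replaces this raw pixel search with optimization in the latent space of a learned deep generator network, which is omitted here. The input shape, step count, and L2 regularizer are illustrative.

    import torch

    def activation_maximization(model, unit, input_shape=(1, 3, 224, 224),
                                steps=200, lr=0.1, reg=1e-4):
        # Ascend the gradient of one output unit with respect to the input image,
        # starting from noise; the DGN prior from the paper is replaced by a
        # simple L2 penalty.
        model.eval()
        x = torch.randn(input_shape, requires_grad=True)
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            loss = -model(x)[0, unit] + reg * x.norm()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return x.detach()   # the synthesized "preferred input" for that unit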
Streaming Weak Submodularity: Interpreting Neural Networks on the Fly
TLDR
This paper casts interpretability of black-box classifiers as a combinatorial maximization problem and proposes an efficient streaming algorithm to solve it subject to cardinality constraints and provides a constant factor approximation guarantee for this general class of functions.
Visualizing and Understanding Recurrent Networks
TLDR
This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.
Understanding Neural Networks Through Deep Visualization
TLDR
This work introduces several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations of convolutional neural networks.
Towards Bayesian Deep Learning: A Survey
TLDR
A general introduction to Bayesian deep learning is provided and its recent applications to recommender systems, topic models, and control are reviewed.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
TLDR
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
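A compact numpy sketch of the local-surrogate idea behind LIME for tabular inputs: sample perturbations around the instance, weight them by proximity, and read the explanation off a weighted linear fit. This is a simplification for illustration and does not use the released lime package or its API; the Gaussian sampling and kernel width are assumptions.

    import numpy as np

    def local_linear_explanation(model, x, n_samples=500, scale=0.5, kernel_width=0.75, seed=0):
        # Fit a locally weighted linear surrogate around instance x; the
        # coefficients act as per-feature local importances.
        rng = np.random.default_rng(seed)
        Z = x + scale * rng.standard_normal((n_samples, x.shape[0]))   # neighbours of x
        y = np.array([model(z) for z in Z])                            # black-box outputs
        dist = np.linalg.norm(Z - x, axis=1)
        w = np.sqrt(np.exp(-(dist ** 2) / kernel_width ** 2))          # proximity weights
        Zb = np.hstack([np.ones((n_samples, 1)), Z])                   # intercept column
        coef, *_ = np.linalg.lstsq(Zb * w[:, None], y * w, rcond=None)
        return coef[1:]                                                # drop the intercept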
...