Convolutional Dynamic Alignment Networks for Interpretable Classifications

@article{Boehle2021ConvolutionalDA,
  title={Convolutional Dynamic Alignment Networks for Interpretable Classifications},
  author={Moritz D Boehle and Mario Fritz and Bernt Schiele},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={10024-10033}
}
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. Their core building blocks are Dynamic Alignment Units (DAUs), which linearly transform their input with weight vectors that dynamically align with task-relevant patterns. As a result, CoDA-Nets model the classification prediction through a series of input-dependent linear transformations, allowing for …
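To make the DAU idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it assumes a unit of the form out = w(x)·x with w(x) = g(Ax + b), where g bounds the norm of the dynamic weight vector, and it omits details from the paper such as the low-rank factorization of A and the convolutional arrangement of units. The class name, the norm bound, and the demo at the bottom are illustrative choices.

import torch
import torch.nn as nn

class DynamicAlignmentUnit(nn.Module):
    """Minimal sketch of a Dynamic Alignment Unit (DAU).

    Computes an input-dependent weight vector w(x) = g(Ax + b), where g
    rescales w(x) to a bounded L2 norm, and returns the linear transform
    w(x)^T x. For a fixed input, the output is therefore an exactly linear
    function of x, which is what allows per-dimension contribution maps.
    """

    def __init__(self, in_features: int, max_norm: float = 1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, in_features)  # plays the role of A and b
        self.max_norm = max_norm

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dynamic weight vector, with its L2 norm capped at max_norm.
        w = self.linear(x)
        norm = w.norm(dim=-1, keepdim=True).clamp(min=1e-6)
        w = w * (self.max_norm / norm).clamp(max=1.0)
        # Input-dependent linear transformation of x.
        return (w * x).sum(dim=-1)

if __name__ == "__main__":
    dau = DynamicAlignmentUnit(in_features=16)
    x = torch.randn(4, 16)   # batch of 4 inputs
    print(dau(x).shape)      # torch.Size([4])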
1 Citation

A comparison of deep saliency map generators on multispectral data in object detection
This work tries to close these gaps by investigating how the saliency maps produced by three generator methods differ across the different spectra, and examines how the methods perform when used for object detection.
