Corpus ID: 49482223

This looks like that: deep learning for interpretable image recognition

@inproceedings{Chen2019ThisLL,
  title={This looks like that: deep learning for interpretable image recognition},
  author={Chaofan Chen and Oscar Li and Alina Barnett and Jonathan Su and Cynthia Rudin},
  booktitle={NeurIPS},
  year={2019}
}
When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. [...]
Key Method: The network uses only image-level labels for training, meaning that there are no labels for parts of images. We demonstrate the method on the CIFAR-10 dataset and 10 classes from the CUB-200-2011 dataset.
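
At the heart of this method is a prototype layer that scores every patch of a convolutional feature map against learned prototypes, so a prediction can be explained as "this patch looks like that prototype". Below is a minimal PyTorch sketch of that similarity computation; the function name, shapes, and toy usage are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F

def prototype_activations(features, prototypes, eps=1e-4):
    """Score every spatial patch of a conv feature map against each prototype.

    features:   (B, D, H, W) feature map from a CNN backbone
    prototypes: (P, D, 1, 1) learned prototype vectors (one spatial cell each)
    Returns:    (B, P) max-pooled similarity per prototype, a log-ratio
                activation that grows as the patch-prototype distance shrinks.
    """
    P = prototypes.shape[0]
    # Squared L2 distance via the expansion ||x - p||^2 = ||x||^2 - 2 x.p + ||p||^2;
    # the cross term is a 1x1 convolution with the prototypes as filters.
    x_sq = (features ** 2).sum(dim=1, keepdim=True)               # (B, 1, H, W)
    p_sq = (prototypes ** 2).sum(dim=(1, 2, 3)).view(1, P, 1, 1)  # (1, P, 1, 1)
    xp = F.conv2d(features, prototypes)                           # (B, P, H, W)
    dist = torch.clamp(x_sq - 2 * xp + p_sq, min=0)
    # Max over locations finds the patch that "looks like" each prototype.
    sim = torch.log((dist + 1) / (dist + eps))
    return sim.flatten(2).max(dim=2).values                       # (B, P)

# Toy usage: 2 images, 64-dim features on a 7x7 grid, 10 prototypes.
feats = torch.randn(2, 64, 7, 7)
protos = torch.randn(10, 64, 1, 1)
print(prototype_activations(feats, protos).shape)  # torch.Size([2, 10])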
Interpretable Image Recognition by Constructing Transparent Embedding Space
Humans usually explain their reasoning (e.g. classification) by dissecting the image and pointing out the evidence from these parts to the concepts in their minds. Inspired by this cognitive process, …
These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition
TLDR: A deep learning model, the negative-positive prototypical part network (NP-ProtoPNet), that attempts to imitate human reasoning for image recognition by comparing the parts of a test image with the corresponding parts of images from known classes.
MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks
TLDR: The MACE framework dissects the feature maps generated by a convolutional network for an image to extract concept-based prototypical explanations, and estimates the relevance of the extracted concepts to the pretrained model's predictions, a critical aspect of explaining individual class predictions that is missing in existing approaches.
Intrinsically Interpretable Image Recognition with Neural Prototype Trees
Prototype-based methods use interpretable representations to address the black-box nature of deep learning models, in contrast to post-hoc explanation methods that only approximate such models. We propose the Neural Prototype Tree (ProtoTree), an intrinsically interpretable deep learning method for fine-grained image recognition.
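
ProtoTree's tree can be viewed as soft routing: each internal node holds a prototype, the similarity at that node (squashed into [0, 1]) acts as the probability of routing right, and the prediction is a path-probability-weighted mixture of leaf distributions. The following is a hypothetical sketch of that routing for a perfect binary tree, not the authors' implementation.

import torch

def soft_tree_predict(sims, leaf_logits):
    """Soft routing through a perfect binary tree of prototype nodes.

    sims:        (B, N) similarities in [0, 1], one per internal node,
                 heap-ordered (children of node i are 2i+1 and 2i+2);
                 N + 1 must be a power of two
    leaf_logits: (N + 1, C) class logits stored at the leaves
    Returns:     (B, C) path-probability-weighted mixture of the leaf
                 class distributions.
    """
    B, N = sims.shape
    C = leaf_logits.shape[1]
    leaf_dists = torch.softmax(leaf_logits, dim=1)   # (N+1, C)
    pred = torch.zeros(B, C)
    stack = [(0, torch.ones(B))]                     # (node index, path prob)
    while stack:
        node, p = stack.pop()
        if node >= N:                                # leaf reached
            pred += p.unsqueeze(1) * leaf_dists[node - N]
            continue
        s = sims[:, node]                            # probability of going right
        stack.append((2 * node + 1, p * (1 - s)))
        stack.append((2 * node + 2, p * s))
    return pred

# Toy usage: depth-2 tree (3 internal nodes, 4 leaves), 5 classes.
sims = torch.sigmoid(torch.randn(2, 3))
leaves = torch.randn(4, 5)
print(soft_tree_predict(sims, leaves).shape)  # torch.Size([2, 5])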
Region Comparison Network for Interpretable Few-shot Image Classification
TLDR: A metric-learning-based method named Region Comparison Network (RCN) is proposed that reveals how few-shot learning works in a neural network, and finds specific regions that are related to each other in images from the query and support sets.
Evaluating CNN Interpretability on Sketch Classification
TLDR: This paper investigates the performance and interpretability of BagNet on a dataset of human sketches, introduces a heatmap interpretability (HI) score to quantify model interpretability, and presents a user study examining BagNet's interpretability from the user's perspective.
While deep neural networks (DNNs) have been shown to outperform humans on many vision tasks, their opaque decision-making process inhibits widespread uptake, especially in high-risk …
Interpretable and Accurate Fine-grained Recognition via Region Grouping
  • Zixuan Huang, Yin Li · 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
TLDR: This work presents an interpretable deep model for fine-grained visual recognition that compares favourably to state-of-the-art methods on classification tasks and outperforms previous approaches on the localization of object parts.
Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
TLDR: A high-performance DNN architecture on ImageNet whose decisions are considerably easier to explain is introduced; it behaves similarly to state-of-the-art deep neural networks such as VGG-16, ResNet-152, or DenseNet-169 in terms of feature sensitivity, error distribution, and interactions between image parts.
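
The bag-of-local-features recipe is easy to reproduce in miniature: classify every small patch independently with a 1x1 convolutional head and average the per-patch logits, so each patch's evidence can be read off directly from the logit heatmap. A toy stand-in sketch, not the BagNet architecture from the paper:

import torch
import torch.nn as nn

class TinyBagNet(nn.Module):
    """Bag-of-local-features classifier: per-patch logits, spatially averaged.

    The trunk's receptive field stays small (7x7 here, from three unpadded
    3x3 convs), so each spatial logit depends only on a local patch.
    """
    def __init__(self, num_classes=10):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3), nn.ReLU(),
        )
        # A 1x1 conv is a linear classifier applied to every patch embedding.
        self.patch_logits = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, x):
        heat = self.patch_logits(self.trunk(x))  # (B, C, H', W') evidence map
        return heat.mean(dim=(2, 3)), heat       # image logits + patch heatmap

model = TinyBagNet()
logits, evidence = model(torch.randn(1, 3, 64, 64))
print(logits.shape, evidence.shape)  # (1, 10) and (1, 10, 58, 58)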
Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification
TLDR: This paper shows through interpretability techniques how the network's focus changes from coarse to fine details, and shows that a simple attention model can aggregate (with weights) these finer details to focus on the most dominant discriminative part of the image.

References

Showing 1-10 of 84 references
Part-Stacked CNN for Fine-Grained Visual Categorization
TLDR: A novel Part-Stacked CNN architecture is proposed that explicitly explains the fine-grained recognition process by modeling subtle differences between object parts, evaluated from the perspectives of classification accuracy, model interpretability, and efficiency.
Visualizing and Understanding Convolutional Networks
TLDR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it finds model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.
Network Dissection: Quantifying Interpretability of Deep Visual Representations
TLDR: This work uses the proposed Network Dissection method to test the hypothesis that interpretability is an axis-independent property of the representation space, then applies the method to compare the latent representations of various networks when trained to solve different classification problems.
Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability …
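
Concretely, the class activation map (CAM) for a class weighs each feature channel by that class's weight in the final linear layer and sums over channels. A minimal sketch assuming the usual GAP-then-linear architecture; the tensor shapes are illustrative.

import torch

def class_activation_map(features, fc_weight, class_idx):
    """CAM for one class: weight each feature channel by the linear
    classifier's weight for that class, then sum over channels.

    features:  (D, H, W) conv feature map of one image (pre-GAP)
    fc_weight: (C, D) weights of the final linear layer after global
               average pooling
    Returns:   (H, W) localization map for class_idx.
    """
    w = fc_weight[class_idx]                       # (D,)
    return torch.einsum('d,dhw->hw', w, features)

# Toy usage: 512 channels on a 7x7 grid, 1000-way classifier.
feats = torch.randn(512, 7, 7)
fc_w = torch.randn(1000, 512)
print(class_activation_map(feats, fc_w, class_idx=243).shape)  # torch.Size([7, 7])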
Neural Activation Constellations: Unsupervised Part Model Discovery with Convolutional Networks
TLDR: An approach is presented that learns part models in a completely unsupervised manner, without part annotations and even without bounding boxes during learning, by finding constellations of neural activation patterns computed using convolutional neural networks.
Learning Multiple Layers of Features from Tiny Images
TLDR: It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Striving for Simplicity: The All Convolutional Net
TLDR: It is found that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks.
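
The substitution the paper tests is mechanical: wherever a 2x2 max-pool with stride 2 downsamples a feature map, a convolution with stride 2 can do the same job with learned weights. A minimal comparison of the two choices (layer sizes are arbitrary):

import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)

# Conventional downsampling: fixed max-pooling, no learned parameters.
pooled = nn.MaxPool2d(kernel_size=2, stride=2)(x)

# All-convolutional downsampling: a learned strided convolution instead.
strided = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)(x)

print(pooled.shape, strided.shape)  # both torch.Size([1, 64, 16, 16])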
Visualizing Higher-Layer Features of a Deep Network
TLDR: This paper contrasts and compares several techniques applied to Stacked Denoising Autoencoders and Deep Belief Networks, trained on several vision datasets, and shows that good qualitative interpretations of the high-level features represented by such models are possible at the unit level.
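
One of the simplest techniques in this family is activation maximization: start from noise and follow the gradient of a chosen unit's activation with respect to the input pixels. A bare-bones sketch; the model, unit index, and hyperparameters are placeholders.

import torch

def maximize_unit(model, unit, steps=200, lr=0.05, size=(1, 3, 64, 64)):
    """Activation maximization: synthesize an input that most excites one
    output unit, by gradient ascent on the input pixels."""
    x = torch.randn(size, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -model(x)[0, unit]   # negate: the optimizer minimizes
        loss.backward()
        opt.step()
    return x.detach()

# Toy usage with a stand-in model (a single conv + pooling "network").
net = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
)
print(maximize_unit(net, unit=3).shape)  # torch.Size([1, 3, 64, 64])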
Interpretable Basis Decomposition for Visual Explanation
TLDR: A new framework called Interpretable Basis Decomposition is proposed for providing visual explanations for classification networks, decomposing the neural activations of the input image into semantically interpretable components pre-trained from a large concept corpus.
From image-level to pixel-level labeling with Convolutional Networks
TLDR: A Convolutional Neural Network-based model is proposed that is constrained during training to put more weight on pixels which are important for classifying the image, and which beats state-of-the-art results on the weakly supervised object segmentation task by a large margin.