LFI-CAM: Learning Feature Importance for Better Visual Explanation

@inproceedings{Lee2021LFICAMLF,
  title={LFI-CAM: Learning Feature Importance for Better Visual Explanation},
  author={Kwang Hee Lee and Chaewon Park and Jung Hyun Oh and Nojun Kwak},
  booktitle={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021},
  pages={1335--1343}
}
Class Activation Mapping (CAM) is a powerful technique used to understand the decision-making of Convolutional Neural Networks (CNNs) in computer vision. Recently, there have been attempts not only to generate better visual explanations, but also to improve classification performance using visual explanations. However, previous works still have their own drawbacks. In this paper, we propose a novel architecture, LFI-CAM (Learning Feature Importance Class Activation Mapping), which is trainable…
Bag-level classification network for infrared target detection
TLDR
This paper explores the feasibility of learning a pixel-level classification scheme given only image-level label information, and investigates the use of class activation maps to inform feature selection for binary, pixel-level classification tasks.
Time-Aware and Feature Similarity Self-Attention in Vessel Fuel Consumption Prediction
TLDR
The ensemble model of TA- and FA-based BiLSTMs, which consists of fully connected layers, can simultaneously capture different properties of ship data, and the results show that it improves performance in predicting fuel consumption.
Extending the Abstraction of Personality Types based on MBTI with Machine Learning and Natural Language Processing
TLDR
The results showed that focusing the data iteration loop on quality, explanatory power, and representativeness, in order to abstract the resources most relevant to the studied phenomenon, improved the evaluation metrics more quickly and at lower cost than complex models such as LSTM or state-of-the-art ones such as BERT.

References

ImageNet: A large-scale hierarchical image database
TLDR
A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks
  • Haofan Wang, Zifan Wang, Xia Hu
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2020
TLDR
This paper develops a novel post-hoc visual explanation method called Score-CAM, based on class activation mapping, that outperforms previous methods on both recognition and localization tasks and also passes the sanity check.
Attention Branch Network: Learning of Attention Mechanism for Visual Explanation
TLDR
Attention Branch Network (ABN) is proposed, which extends a response-based visual explanation model by introducing a branch structure with an attention mechanism and is trainable for visual explanation and image recognition in an end-to-end manner.
Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks
TLDR
This paper proposes Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration, to provide better visual explanations of CNN model predictions.
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
TLDR
This work proposes a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable, and shows that even non-attention-based models learn to localize discriminative regions of the input image.
Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability.
RISE: Randomized Input Sampling for Explanation of Black-box Models
TLDR
The problem of Explainable AI for deep neural networks that take images as input and output a class probability is addressed and an approach called RISE that generates an importance map indicating how salient each pixel is for the model's prediction is proposed.
Squeeze-and-Excitation Networks
TLDR
This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
Mask R-CNN
TLDR
This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.
Methods for interpreting and understanding deep neural networks