Corpus ID: 233393922

Weakly Supervised Multi-task Learning for Concept-based Explainability

@article{Belem2021WeaklySM,
  title={Weakly Supervised Multi-task Learning for Concept-based Explainability},
  author={Catarina Belém and Vladimir Balayan and Pedro Saleiro and P. Bizarro},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.12459}
}
In ML-aided decision-making tasks, such as fraud detection or medical diagnosis, the human-in-the-loop, usually a domain-expert without technical ML knowledge, prefers high-level concept-based explanations instead of low-level explanations based on model features. To obtain faithful concept-based explanations, we leverage multi-task learning to train a neural network that jointly learns to predict a decision task based on the predictions of a precedent explainability task (i.e., multi-label… 
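The jointly trained architecture the abstract describes can be sketched as follows: a shared encoder feeds a multi-label concept head (the explainability task), and the decision head consumes the concept predictions. This is a minimal illustration under assumptions, not the authors' implementation; the layer sizes, the 0.5 concept-loss weight, and all names (e.g., ConceptThenDecisionNet) are hypothetical, and a real system would use domain features and weak concept labels rather than random tensors.

    import torch
    import torch.nn as nn

    class ConceptThenDecisionNet(nn.Module):
        """Shared encoder -> multi-label concept head -> decision head (sketch)."""
        def __init__(self, n_features: int, n_concepts: int):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
            self.concept_head = nn.Linear(64, n_concepts)   # explainability task (multi-label concepts)
            self.decision_head = nn.Linear(n_concepts, 1)   # decision task, fed by concept predictions

        def forward(self, x):
            concept_logits = self.concept_head(self.encoder(x))
            decision_logit = self.decision_head(torch.sigmoid(concept_logits))
            return concept_logits, decision_logit

    # One illustrative joint-training step on random data; in the weakly
    # supervised setting the concept labels would be noisy or automatically derived.
    model = ConceptThenDecisionNet(n_features=30, n_concepts=8)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    x = torch.randn(16, 30)
    concept_labels = torch.randint(0, 2, (16, 8)).float()
    decision_labels = torch.randint(0, 2, (16, 1)).float()

    concept_logits, decision_logit = model(x)
    loss = bce(decision_logit, decision_labels) + 0.5 * bce(concept_logits, concept_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()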

Citations

Concept Bottleneck Model With Additional Unsupervised Concepts
TLDR
A novel interpretable model based on the concept bottleneck model with additional unsupervised concepts (CBM-AUC) is proposed that outperforms CBM and SENN; the saliency map of each concept is visualized and confirmed to be consistent with its semantic meaning.
Weakly supervised explanation generation for computer aided diagnostic systems
TLDR
A novel approach is proposed to explain the classification decision by providing anatomically accurate heatmaps that denote the important regions of interest in the image which helped the model make the prediction.
Concept Embedding Analysis: A Review
TLDR
A general definition of CA and a taxonomy for CA methods are established, uniting several ideas from the literature, which makes it easy to position and compare CA approaches.
Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations
TLDR
This work introduces interactive Concept Swapping Networks (iCSNs), a novel framework for learning concept-grounded representations via weak supervision and implicit prototype representations that facilitates human understanding and human-machine interaction.

References

SHOWING 1-10 OF 17 REFERENCES
Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks
TLDR
This paper proposes a novel model dubbed Piecewise Convolutional Neural Networks (PCNNs) with multi-instance learning to address the wrong-label problem that arises when using distant supervision for relation extraction, and adopts a convolutional architecture with piecewise max pooling to automatically learn relevant features.
Towards Automatic Concept-based Explanations
TLDR
This work proposes principles and desiderata for concept-based explanation, which goes beyond per-sample features to identify higher-level human-understandable concepts that apply across the entire dataset.
A Survey on Transfer Learning
TLDR
The relationship between transfer learning and other related machine learning techniques, such as domain adaptation, multi-task learning, sample selection bias, and covariate shift, is discussed.
A Unified Approach to Interpreting Model Predictions
TLDR
SHAP (SHapley Additive exPlanations), a unified framework for interpreting predictions, is presented; it unifies six existing methods and introduces new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
Characterizing and Avoiding Negative Transfer
TLDR
A novel technique is proposed to circumvent negative transfer by filtering out unrelated source data based on adversarial networks, which is highly generic and can be applied to a wide range of transfer learning algorithms.
A Survey on Multi-Task Learning
TLDR
A survey of MTL is given, which classifies MTL algorithms into several categories, including the feature learning, low-rank, task clustering, task relation learning, and decomposition approaches, and then discusses the characteristics of each approach.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
TLDR
LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
TLDR
Concept Activation Vectors (CAVs) are introduced, which provide an interpretation of a neural net's internal state in terms of human-friendly concepts, and may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
Towards Robust Interpretability with Self-Explaining Neural Networks
TLDR
This work designs self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models, and proposes three desiderata for explanations in general – explicitness, faithfulness, and stability.
Multi-Task Learning for Dense Prediction Tasks: A Survey.
TLDR
This survey provides a well-rounded view of state-of-the-art deep learning approaches for MTL in computer vision, with an explicit emphasis on dense prediction tasks.
...